Video telephone interpretation system and a video telephone interpretation method

ABSTRACT

A videophone interpretation system accepts a call from a caller terminal, refers to an interpreter registration table to extract the terminal number of an interpreter capable of interpreting between the language of a caller and the language of a callee, and connects the caller terminal, a callee terminal and an interpreter terminal. The videophone interpretation system includes a function to communicate video and audio necessary for interpretation between the terminals. The audio of an interpreter is transmitted either to the caller or callee, which is specified by the interpreter terminal. The audio of the conversation partner is suppressed or interrupted when the audio of the interpreter is detected by an audio synthesizer, thereby providing a quick and precise interpretation service.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a videophone interpretation system and a videophone interpretation method which provide an interpretation service for a conversation with a videophone between persons speaking different languages, and in particular, to a videophone interpretation system and a videophone interpretation method which provide administration services, such as those offered by a public office, a hospital and a police station, to a foreigner who is incapable of using the local language, without an interpreter being present in the administrative bodies mentioned above.

2. Description of the Related Art

In recent years, persons in remote locations converse with each other at a practical level, using a videophone, due to developments in communications technologies. In order for persons who speak different languages to effectively converse with each other, an interpreter is required. It is thus desired that an interpretation service with a videophone will become widely available.

In the prior art, in order to obtain an interpretation service with a videophone, a three-way call must be established by using a multipoint conferencing unit offering a teleconference service between a caller who wants to have a conversation, a callee as a conversation partner, and an interpreter who interprets between a language used by the caller and a language used by the callee.

FIG. 22 shows a prior art configuration whereby an interpretation service is obtained by using a video conference service with a multipoint conferencing unit. In FIG. 22, numeral 10 represents a videophone terminal for the caller (hereinafter referred to as a caller terminal), numeral 20 represents a videophone terminal for the callee (hereinafter referred to as a callee terminal), numeral 30 represents a videophone terminal for the interpreter (hereinafter referred to as an interpreter terminal), numeral 50 represents a public telephone line, and numeral 1 represents a multipoint conferencing unit. Each videophone terminal includes a camera (a) for picking up the user, a display (b) for displaying a received video, a dial pad (c) for dialing the number of a distant party, and a headset (d) including a microphone for acquiring the voice of the user and an earphone for listening to the received audio. The multipoint conferencing unit 1 offers a videoconferencing service and includes a function to accept a call from a reserved terminal, to synthesize video and audio transmitted from the connected terminals, and to transmit the synthesized video and audio to each terminal.

Next, the procedure used to obtain an interpretation service using the multipoint conferencing unit will be described. First, a caller searches for and calls an interpreter who is capable of interpreting between the language used by the caller and that used by the callee. Next, the called interpreter calls the callee based on the request made by the caller and determines a conversation date and time. When the conversation date and time is determined, the caller reserves teleconferencing at the multipoint conferencing unit 1. The caller, the callee and the interpreter check in to the multipoint conferencing unit 1 with respective videophone terminals by using the specified login information when the reserved date and time is reached. This begins teleconferencing between the caller terminal 10, the callee terminal 20 and the interpreter terminal 30. On the display of each terminal, video obtained by synthesizing the video of the caller, the video of the callee and the video of the interpreter is displayed. To the earphone of the headset of each terminal, audio obtained by synthesizing the audio of the caller, the audio of the callee and the audio of the interpreter is output. Thus, the caller and the callee can have a videophone conversation while obtaining interpretation by the interpreter.

In such a prior art videophone interpretation service using a multipoint conferencing unit, it is necessary to reserve a teleconference on the multipoint conferencing unit before starting a videophone conversation, and the caller must search for an interpreter, contact the callee and hold consultation to set a videoconference in advance.

Thus, it has been difficult to apply this approach to an interpretation service which requires immediate support, such as where a foreigner who is incapable of using the local language urgently needs to obtain an administration service from a public office, a hospital or a police station. The interpreter must join from the stage of prior consultation between the caller and the callee. This occupies the interpreter for a long time such that the interpretation service cost increases.

SUMMARY OF THE INVENTION

To overcome the problems described above, preferred embodiments of the invention provide a videophone interpretation system and a videophone interpretation method which eliminate the need for a caller to search for an interpreter and consult with a callee in advance, and which are available in an emergency, thereby minimizing the time required of the interpreter and reducing the interpretation service cost.

A videophone interpretation system according to a preferred embodiment of the present invention is a system in which an interpreter interprets a videophone conversation between a caller and a callee who speak different languages. The videophone interpretation system preferably includes connection means for connecting a caller terminal, a callee terminal and an interpreter terminal, and communication means for communicating video and audio between the terminals connected by the connection means, wherein the connection means includes an interpreter registration table in which at least the language types that are interpretable by an interpreter and the terminal number of the interpreter are registered, a function to accept a call from a caller terminal, a function to acquire the terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which the call was accepted, a function to extract the terminal number of the interpreter by referencing the interpreter registration table from the acquired language type of the caller and language type of the callee, a function to call the interpreter terminal using the extracted terminal number of the interpreter, and a function to call the callee terminal by using the acquired terminal number of the callee, and wherein the communication means includes a function to transmit video including at least video from the callee terminal and audio including at least audio from the interpreter terminal to the caller terminal, a function to transmit video including at least video from the caller terminal and audio including at least audio from the interpreter terminal to the callee terminal, and a function to transmit audio including at least audio from the caller terminal and audio from the callee terminal to the interpreter terminal.

With this configuration, when a call is made from a caller terminal, the terminal number of an interpreter capable of interpreting between the language of the caller and the language of the callee is extracted from the interpreter registration table, and the caller terminal, the callee terminal and the interpreter terminal are automatically connected, and video and audio required for interpretation are communicated. The caller need not previously search for an interpreter and hold consultation with the callee, thus providing a videophone interpretation service which is available even in an emergency. The interpreter can join a videophone conversation anywhere he/she may be, as long as he/she can be called. This minimizes the time needed by the interpreter, and thus, reduces the interpretation service cost.
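The extraction step described above can be illustrated with a minimal Python sketch. The names `InterpreterEntry`, `extract_interpreter`, `plan_calls`, the language codes and the terminal numbers are hypothetical and do not appear in the specification; the order in which the interpreter and the callee are called is also an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class InterpreterEntry:
    """One row of a hypothetical interpreter registration table."""
    terminal_number: str
    languages: frozenset  # language types this interpreter can interpret


def extract_interpreter(table: Iterable[InterpreterEntry],
                        caller_lang: str,
                        callee_lang: str) -> Optional[str]:
    """Return the terminal number of the first registered interpreter who
    covers both languages, or None when no entry matches."""
    for entry in table:
        if caller_lang in entry.languages and callee_lang in entry.languages:
            return entry.terminal_number
    return None


def plan_calls(table: Iterable[InterpreterEntry], callee_number: str,
               caller_lang: str, callee_lang: str) -> List[str]:
    """Return the terminal numbers to call (interpreter, then callee).
    An empty list means no suitable interpreter is registered."""
    interpreter_number = extract_interpreter(table, caller_lang, callee_lang)
    if interpreter_number is None:
        return []
    return [interpreter_number, callee_number]


# Illustrative table contents (made-up numbers and language codes).
TABLE = [
    InterpreterEntry("0311110001", frozenset({"ja", "en"})),
    InterpreterEntry("0311110002", frozenset({"ja", "zh", "en"})),
]
```

For example, `plan_calls(TABLE, "0322220000", "zh", "ja")` selects the second entry, since only that interpreter registers both languages.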

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a function to transmit video obtained by synthesizing video from the callee terminal as a main window and video from the interpreter terminal as a sub window to the caller terminal, a function to transmit video obtained by synthesizing video from the caller terminal as a main window and video from the interpreter terminal as a sub window to the callee terminal, and a function to transmit video obtained by synthesizing video from the caller terminal and video from the callee terminal to the interpreter terminal.

This enables the caller and the callee to check the expression of the interpreter in a Picture-in-Picture fashion such that it is easier to understand the voice of the interpreter. The interpreter can check the expression of the caller and the expression of the callee such that a precise interpretation is enabled.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a first audio transmission function to synthesize audio from the callee terminal and audio from the interpreter terminal and transmit the result to the caller terminal, a second audio transmission function to synthesize audio from the caller terminal and audio from the interpreter terminal and transmit the result to the callee terminal, a third audio transmission function to synthesize audio from the caller terminal and audio from the callee terminal and transmit the result to the interpreter terminal, and an unnecessary side audio suppression function to suppress an unnecessary side audio of either audio from the interpreter terminal supplied to the first audio transmission function or audio from the interpreter terminal supplied to the second audio transmission function based on a command from the interpreter terminal, wherein the first audio transmission function includes a callee audio suppression function to suppress audio from the callee terminal when audio from the interpreter terminal is detected, and the second audio transmission function includes a caller audio suppression function to suppress audio from the caller terminal when audio from the interpreter terminal is detected.

In interpretations using a prior art videoconference, audio obtained by synthesizing the audio of the three parties is transmitted to each terminal. Thus, when a user at a terminal speaks while a user at any other terminal is speaking, the content of the conference is difficult to understand. Thus, the interpreter waits until the completion of the speech of the caller before interpretation, the callee waits until the completion of the interpretation before speech, and the interpreter waits until the completion of the speech of the callee before interpretation. Since such a procedure must be repeated in a conference, it has been difficult to perform a quick and precise interpretation. According to preferred embodiments of the present invention, the unnecessary side audio suppression function suppresses an unnecessary side transmission of audio of the interpreter to either the caller or the callee, based on a command from the interpreter terminal. When the audio of the interpreter is detected, transmission of the original audio of the callee to the caller is suppressed by the callee audio suppression function. When the audio of the interpreter is detected, transmission of the original audio of the caller to the callee is suppressed by the caller audio suppression function. With these functions, the caller and the callee can understand the interpretation even when their speech overlaps that of the interpreter, thereby providing a quick and precise videophone interpretation service.

The suppression includes a case where the level of an audio signal is reduced in order to allow hearing to some extent and a case where the audio signal is completely turned off so as to mute the audio. The unnecessary side audio suppression function includes a case where the audio of the interpreter is transmitted selectively to either the caller or the callee.
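As a rough illustration of the two suppression functions, the following Python sketch treats one audio frame as a list of float samples. The function names, the detection threshold and the gain value are assumptions chosen for the example, not values given in the specification. A gain of 0.0 corresponds to the muting case, and a small positive gain to the case where the partner remains audible to some extent.

```python
def is_speech(frame, threshold=0.01):
    """Crude detector: any sample at or above the threshold counts as audio."""
    return any(abs(sample) >= threshold for sample in frame)


def route_interpreter(interpreter_frame, target):
    """Unnecessary side suppression: deliver the interpreter's audio only to
    the side named by the interpreter terminal's command.
    Returns (audio_for_caller_mix, audio_for_callee_mix)."""
    silence = [0.0] * len(interpreter_frame)
    if target == "caller":
        return list(interpreter_frame), silence
    if target == "callee":
        return silence, list(interpreter_frame)
    raise ValueError("target must be 'caller' or 'callee'")


def synthesize(partner_frame, interpreter_frame, partner_gain=0.2):
    """Mix partner and interpreter audio for one listener. While interpreter
    audio is detected, the partner's original audio is scaled by
    partner_gain (0.0 mutes it); otherwise it passes through unchanged."""
    gain = partner_gain if is_speech(interpreter_frame) else 1.0
    return [gain * p + i for p, i in zip(partner_frame, interpreter_frame)]
```

With `target="caller"`, the caller's mix ducks the callee's voice while the interpreter speaks, and the callee's mix carries the caller's voice at full level because no interpreter audio reaches that side.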

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a first audio transmission function to selectively transmit either audio from the callee terminal or audio from the interpreter terminal to the caller terminal, a second audio transmission function to selectively transmit either audio from the caller terminal or audio from the interpreter terminal to the callee terminal, a third audio transmission function to synthesize audio from the caller terminal and audio from the callee terminal and transmit the result to the interpreter terminal, and an unnecessary side audio suppression function to suppress an unnecessary side audio of either audio from the interpreter terminal supplied to the first audio transmission function or audio from the interpreter terminal supplied to the second audio transmission function by a command from the interpreter terminal, wherein the first audio transmission function includes a function to turn off audio from the callee terminal and transmit audio from the interpreter terminal when audio from the interpreter terminal is detected, and the second audio transmission function includes a function to turn off audio from the caller terminal and transmit audio from the interpreter terminal when audio from the interpreter terminal is detected.

According to preferred embodiments of the present invention, the unnecessary side audio suppression function suppresses an unnecessary side transmission of audio of the interpreter to either the caller or callee, based on a command from the interpreter terminal. When audio of the interpreter is detected in the first audio transmission function, the original audio of the callee switches to the audio of the interpreter. When audio of the interpreter is detected in the second audio transmission function, the original audio of the caller switches to the audio of the interpreter. With these functions, the caller and the callee can understand the interpretation even when their speech overlaps that of the interpreter, thereby providing a quick and precise videophone interpretation service.
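The switching behavior can be sketched as a simple selector. This is only an illustration under assumed names (`select_audio`) and an assumed amplitude threshold; the specification does not state how interpreter audio is detected.

```python
def select_audio(partner_frame, interpreter_frame, threshold=0.01):
    """Selective transmission for one listener: forward the interpreter's
    frame when interpreter audio is detected, otherwise forward the
    conversation partner's original frame. The two are never mixed."""
    interpreter_detected = any(abs(s) >= threshold for s in interpreter_frame)
    if interpreter_detected:
        return list(interpreter_frame)
    return list(partner_frame)
```

The same selector would serve both the first and the second audio transmission function, with the callee or the caller as the partner respectively.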

The unnecessary side audio suppression function includes a case in which the audio of the interpreter is transmitted selectively to either the caller or the callee.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a first audio transmission function to perform audio multiplexing of audio from the callee terminal and audio from the interpreter terminal and transmit the result to the caller terminal, a second audio transmission function to perform audio multiplexing of audio from the caller terminal and audio from the interpreter terminal and transmit the result to the callee terminal, a third audio transmission function to perform audio multiplexing of audio from the caller terminal and audio from the callee terminal and transmit the result to the interpreter terminal, and an unnecessary side audio suppression function to suppress an unnecessary side audio of either audio from the interpreter terminal supplied to the first audio transmission function or audio from the interpreter terminal supplied to the second audio transmission function, based on a command from the interpreter terminal.

According to preferred embodiments of the present invention, the unnecessary side audio suppression function suppresses an unnecessary side transmission of audio of the interpreter to either the caller or callee, by a command from the interpreter terminal. In the first audio transmission function, the original audio of the callee and the audio of the interpreter are multiplexed and the result is transmitted to the caller. In the second audio transmission function, the original audio of the caller and the audio of the interpreter are multiplexed and the result is transmitted to the callee. With these functions, the caller and the callee can understand the interpretation even when their speech overlaps that of the interpreter, thereby providing a quick and precise videophone interpretation service.

The unnecessary side audio suppression function includes a case where the audio of the interpreter is selectively transmitted to either the caller or callee.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a function to record video including video from the caller terminal, video from the callee terminal and video from the interpreter terminal and audio including audio from the caller terminal, audio from the callee terminal and audio from the interpreter terminal, and a function to reproduce and transmit the recorded video and audio by a request from a terminal.

With this configuration, video and audio from the caller, the callee and the interpreter in an interpretation service are recorded. Since the details of recording can be checked by a request from a terminal, it is possible to review the contents which were not clear at the time of the conversation or to check the details of the communications service at a later time.

Video may be recorded by recording a synthesized video of video to be transmitted to a caller terminal and video to be transmitted to a callee terminal. By doing so, it is possible to check the video received by the caller or callee.

Audio may be recorded by recording audio obtained by performing audio multiplexing on audio to be transmitted to a caller terminal and audio to be transmitted to a callee terminal. By doing so, it is possible to check the contents in the language of the caller and in the language of the callee separately from a terminal equipped with an audio demultiplexing function.

Alternatively, audio to be transmitted to a caller terminal and audio to be transmitted to a callee terminal may be recorded separately and the audio of a side specified by a command from a terminal may be reproduced for transmission. By doing so, it is possible to check the contents in the language of the caller and in the language of the callee separately even from a terminal not equipped with an audio demultiplexing function.

A videophone interpretation system according to preferred embodiments of the present invention is a system where a videophone conversation between a caller and a callee using different languages is interpreted by a first interpreter who interprets the language of the callee into the language of the caller and a second interpreter who interprets the language of the caller into the language of the callee. The videophone interpretation system preferably includes connection means for connecting a caller terminal, a callee terminal, a first interpreter terminal and a second interpreter terminal, and communication means for communicating video and audio between the terminals connected by the connection means, wherein the connection means includes an interpreter registration table where at least the language types interpretable by an interpreter and the terminal number of the interpreter are registered, a function to accept a call from a caller terminal, a function to acquire the terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which the call was accepted, a function to extract the terminal number of the first interpreter by referencing the interpreter registration table from the acquired language type of the callee and language type of the caller, a function to call the first interpreter by using the extracted terminal number of the first interpreter, a function to extract the terminal number of the second interpreter by referencing the interpreter registration table from the acquired language type of the caller and language type of the callee, a function to call the second interpreter by using the extracted terminal number of the second interpreter, and a function to call the callee terminal by using the acquired terminal number of the callee, and wherein the communication means includes a function to transmit video including at least video from the callee terminal and audio including at least audio from the first interpreter terminal to the caller terminal, a function to transmit video including at least video from the caller terminal and audio including at least audio from the second interpreter terminal to the callee terminal, a function to transmit audio including at least audio from the callee terminal to the first interpreter terminal, and a function to transmit audio including at least audio from the caller terminal to the second interpreter terminal.

With this configuration, based on a call from the caller terminal, the terminal number of the first interpreter who interprets the language of the callee into the language of the caller and the terminal number of the second interpreter who interprets the language of the caller into the language of the callee are extracted from the interpreter registration table. The caller terminal, the callee terminal, the first interpreter terminal and the second interpreter terminal are automatically connected and video and audio required for interpretation are communicated. The caller need not previously search for an interpreter and conduct consultation with the callee, thus providing a videophone interpretation service which is available even in an emergency. The interpreter can join a videophone conversation anywhere he/she may be, as long as he/she can be called. This minimizes the time required of the interpreter and reduces the interpretation service cost.
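A minimal sketch of the two directional lookups follows. It assumes, purely for illustration, that each table row registers ordered (source, target) language pairs and that the two interpreters must be different persons; neither detail is stated in the specification, and the names `DirectionalEntry`, `extract_directional` and `extract_both` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class DirectionalEntry(NamedTuple):
    """Hypothetical table row: terminal number plus the ordered
    (source_language, target_language) pairs the interpreter handles."""
    terminal_number: str
    pairs: frozenset


def extract_directional(table, source, target, exclude=()) -> Optional[str]:
    """Return the first terminal number registered for source -> target,
    skipping any terminal numbers listed in exclude."""
    for entry in table:
        if entry.terminal_number in exclude:
            continue
        if (source, target) in entry.pairs:
            return entry.terminal_number
    return None


def extract_both(table, caller_lang, callee_lang) -> Tuple[Optional[str], Optional[str]]:
    """First interpreter: callee language -> caller language.
    Second interpreter: caller language -> callee language."""
    first = extract_directional(table, callee_lang, caller_lang)
    second = extract_directional(table, caller_lang, callee_lang,
                                 exclude=(first,) if first else ())
    return first, second


# Illustrative table contents (made-up numbers and language codes).
PAIR_TABLE = [
    DirectionalEntry("0311110001", frozenset({("en", "ja"), ("ja", "en")})),
    DirectionalEntry("0311110002", frozenset({("ja", "en")})),
]
```

Here a Japanese-speaking caller and an English-speaking callee would be served by the first row for English into Japanese and by the second row for Japanese into English.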

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a function to transmit video obtained by synthesizing video from the callee terminal as a main window and video from the first interpreter terminal as a sub window to the caller terminal, a function to transmit video obtained by synthesizing video from the caller terminal as a main window and video from the second interpreter terminal as a sub window to the callee terminal, a function to transmit video obtained by synthesizing video from the callee terminal and video from the caller terminal to the first interpreter terminal, and a function to transmit video obtained by synthesizing video from the caller terminal and video from the callee terminal to the second interpreter terminal.

This enables the caller and the callee to check the expressions of the first interpreter and the second interpreter, respectively, in a Picture-in-Picture fashion such that it is easy to understand the voice of the interpreter. Each interpreter can check the expression of the caller and the expression of the callee such that a precise interpretation is enabled.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a first audio transmission function to synthesize audio from the callee terminal and audio from the first interpreter terminal and transmit the result to the caller terminal, a second audio transmission function to synthesize audio from the caller terminal and audio from the second interpreter terminal and transmit the result to the callee terminal, a third audio transmission function to transmit at least audio from the callee terminal to the first interpreter terminal, and a fourth audio transmission function to transmit at least audio from the caller terminal to the second interpreter terminal, wherein the first audio transmission function includes a callee audio suppression function to suppress audio from the callee terminal when audio from the first interpreter terminal is detected, and the second audio transmission function includes a caller audio suppression function to suppress audio from the caller terminal when audio from the second interpreter terminal is detected.

According to various preferred embodiments of the present invention, when the audio of the first interpreter is detected, transmission of the original audio of the callee to the caller is suppressed by the callee audio suppression function. When the audio of the second interpreter is detected, transmission of the original audio of the caller to the callee is suppressed by the caller audio suppression function. With these functions, the caller and the callee can understand the interpretation even when their speech overlaps that of the interpreter, thereby providing a quick and precise videophone interpretation service.

The suppression includes a case in which the level of an audio signal is reduced in order to allow hearing to some extent and a case in which the audio signal is turned off so as to mute the audio.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a first audio transmission function to selectively transmit either audio from the callee terminal or audio from the first interpreter terminal to the caller terminal, a second audio transmission function to selectively transmit either audio from the caller terminal or audio from the second interpreter terminal to the callee terminal, a third audio transmission function to transmit at least audio from the callee terminal to the first interpreter terminal, and a fourth audio transmission function to transmit at least audio from the caller terminal to the second interpreter terminal, wherein the first audio transmission function includes a function to turn off audio from the callee terminal and transmit audio from the first interpreter terminal when detecting audio from the first interpreter terminal, and the second audio transmission function includes a function to turn off audio from the caller terminal and transmit audio from the second interpreter terminal when detecting audio from the second interpreter terminal.

According to preferred embodiments of the present invention, when the audio of the first interpreter is detected in the first audio transmission function, the original audio of the callee is switched to the audio of the first interpreter. When the audio of the second interpreter is detected in the second audio transmission function, the original audio of the caller is switched to the audio of the second interpreter. With these functions, the caller and the callee can understand the interpretation even when their speech overlaps that of each interpreter, thereby providing a quick and precise videophone interpretation service.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a first audio transmission function to perform audio multiplexing of audio from the callee terminal and audio from the first interpreter terminal and transmit the result to the caller terminal, a second audio transmission function to perform audio multiplexing of audio from the caller terminal and audio from the second interpreter terminal and transmit the result to the callee terminal, a third audio transmission function to transmit at least audio from the callee terminal to the first interpreter terminal, and a fourth audio transmission function to transmit at least audio from the caller terminal to the second interpreter terminal.

According to preferred embodiments of the present invention, in the first audio transmission function, the original audio of the callee and the audio of the first interpreter are audio multiplexed and the result is transmitted to the caller. In the second audio transmission function, the original audio of the caller and the audio of the second interpreter are audio multiplexed and the result is transmitted to the callee. With these functions, the caller and the callee can understand the interpretation even when their speech overlaps that of each interpreter, thereby providing a quick and precise videophone interpretation service.

In the videophone interpretation system according to preferred embodiments of the present invention, the communication means preferably includes a function to record video including video from the caller terminal, video from the callee terminal, video from the first interpreter terminal and video from the second interpreter terminal and audio including audio from the caller terminal, audio from the callee terminal, audio from the first interpreter terminal and audio from the second interpreter terminal, and a function to reproduce and transmit the recorded video and audio by a request from a terminal.

With this configuration, video and audio from the caller, callee, first interpreter and second interpreter in an interpretation service are recorded. Since the details of recording can be checked by a request from a terminal, it is possible to review the contents which were not clear at the time of the conversation or to check the details of the communications service at a later time.

Video may be recorded by recording a synthesized video of video to be transmitted to a caller terminal and video to be transmitted to a callee terminal. By doing so, it is possible to check the video received by the caller or the callee.

Audio may be recorded by recording audio obtained by performing audio multiplexing on audio to be transmitted to a caller terminal and audio to be transmitted to a callee terminal. By doing so, it is possible to check the contents in the language of the caller and in the language of the callee separately from a terminal equipped with an audio demultiplexing function.

Alternatively, audio to be transmitted to a caller terminal and audio to be transmitted to a callee terminal may be recorded separately and the audio of a side specified by a command from a terminal may be reproduced and transmitted. By doing so, it is possible to check the contents in the language of the caller and in the language of the callee separately even from a terminal not equipped with an audio demultiplexing function.

In the videophone interpretation system according to preferred embodiments of the present invention, selection information for selecting an interpreter is registered in the interpreter registration table and the connection means preferably includes a function to acquire the conditions for selecting an interpreter from the caller terminal and a function to extract the terminal number of an interpreter who satisfies the acquired selection conditions by referencing the interpreter registration table.

This selects an interpreter who satisfies the purpose of a videophone conversation between a caller and a callee from among the interpreters registered in the interpreter registration table. Selection information for selecting an interpreter includes information about the sex, age, place of residence, specialty, and qualification of the interpreter.

By registering the interpretation level of an interpreter by language in the interpreter registration table, the user can select an interpreter who has a desired level for an interpretation between specified languages. An interpreter can register a plurality of languages, if any, for which he/she can provide interpretation. This enables flexible and efficient selection of an interpreter.

In a videophone interpretation system via bidirectional simultaneous interpretation, a listening comprehension level and a speaking level may be separately registered as interpretation levels by language to be registered in the interpreter registration table. By doing so, it is possible to individually select a person who is suitable as a first interpreter and another who is suitable as a second interpreter, thereby enabling flexible and efficient selection of an interpreter.
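One way to picture separately registered listening and speaking levels is the sketch below. The integer level scale, the minimum level and the names `Levels` and `select_by_level` are assumptions for illustration only. An interpreter qualifies for a direction when the listening level in the source language and the speaking level in the target language both meet the requested minimum.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Levels:
    """Hypothetical per-language levels (higher is better)."""
    listening: int
    speaking: int


def select_by_level(table: Dict[str, Dict[str, Levels]],
                    listen_lang: str, speak_lang: str,
                    min_level: int) -> Optional[str]:
    """Return the terminal number of the first interpreter whose listening
    level in listen_lang and speaking level in speak_lang are both at
    least min_level, or None if nobody qualifies."""
    for terminal_number, levels in table.items():
        listen = levels.get(listen_lang)
        speak = levels.get(speak_lang)
        if listen is None or speak is None:
            continue
        if listen.listening >= min_level and speak.speaking >= min_level:
            return terminal_number
    return None


# Illustrative table: terminal number -> language -> levels.
LEVEL_TABLE = {
    "0311110001": {"en": Levels(listening=3, speaking=1),
                   "ja": Levels(listening=3, speaking=3)},
    "0311110002": {"en": Levels(listening=2, speaking=3),
                   "ja": Levels(listening=3, speaking=2)},
}
```

With a caller speaking Japanese and a callee speaking English, the first interpreter (English into Japanese) would be chosen with `select_by_level(LEVEL_TABLE, "en", "ja", 3)` and the second (Japanese into English) with `select_by_level(LEVEL_TABLE, "ja", "en", 3)`.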

In the videophone interpretation system according to preferred embodiments of the present invention, an availability flag to indicate whether an interpreter is available is preferably registered in the interpreter registration table and the connection means preferably includes a function to refer to an availability flag in the interpreter registration table to extract the terminal number of an available interpreter.

In this manner, by registering whether an interpreter is available in the interpreter registration table, an available interpreter is automatically selected and called. This eliminates useless calling and provides a more flexible and efficient videophone interpretation system.
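
As an illustration only, the extraction of an available interpreter from the interpreter registration table may be sketched as follows. The record layout, the language codes, the terminal numbers and the name find_interpreters are hypothetical and are not part of the described system. The sketch also assumes that a numerically lower value denotes a higher interpretation level.

```python
from dataclasses import dataclass


@dataclass
class Interpreter:
    """One hypothetical row of the interpreter registration table."""
    terminal_number: str
    levels: dict            # language -> interpretation level (assumed: lower is better)
    available: bool = True  # availability flag


def find_interpreters(table, caller_lang, callee_lang, required_level=3):
    """Return terminal numbers of available interpreters covering both languages."""
    matches = []
    for entry in table:
        if not entry.available:
            continue  # skip interpreters whose availability flag is reset
        caller_level = entry.levels.get(caller_lang)
        callee_level = entry.levels.get(callee_lang)
        if caller_level is None or callee_level is None:
            continue  # interpreter has not registered one of the languages
        if caller_level <= required_level and callee_level <= required_level:
            matches.append(entry.terminal_number)
    return matches
```

An interpreter whose availability flag is reset is never returned, so no call is placed to a terminal that would not answer.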

In the videophone interpretation system according to preferred embodiments of the present invention, the connection means preferably includes a function to generate a text message to be transmitted to each of the terminals and the communication means includes a function to transmit the generated text message to each of the terminals.

This transmits a text message which prompts each terminal to enter necessary information when connecting a caller terminal, a callee terminal and an interpreter terminal.

In the videophone interpretation system according to preferred embodiments of the present invention, the connection means preferably includes a function to generate a voice message to be transmitted to each of the terminals and the communication means includes a function to transmit the generated voice message to each of the terminals.

This transmits a voice message to a caller terminal, a callee terminal and an interpreter terminal when the caller terminal, callee terminal and interpreter terminal are to be connected. This makes it possible to provide a videophone interpretation service even when any of the caller, the callee and the interpreter is a visually impaired person.

In the videophone interpretation system according to preferred embodiments of the present invention, the connection means preferably includes a function to register a term used during a conversation based on a command from each of the terminals and a function to extract the registered term and generate a telop based on a command from each of the terminals and the communication means includes a function to transmit the generated telop to each of the terminals.

In this manner, by registering a term in advance that is difficult to interpret, it is possible to display a telop on each of the terminals and to provide a videophone interpretation service which is quick and accurate.

In the videophone interpretation system according to preferred embodiments of the present invention, accounting information about an interpreter is registered in the interpreter registration table and the connection means preferably includes a function to measure the time that the caller terminal or callee terminal obtains an interpretation service and a function to calculate a fee from the measured time and accounting information registered in the interpreter registration table.

By registering the accounting information about an interpreter in the interpreter registration table, it is possible to determine an appropriate fee for a videophone interpretation service.

The interpreter registration table may register the interpretation level of an interpreter by language and an accounting table which specifies the relationship between the interpretation level and the hourly rates may be used to determine accounting information. By doing so, it is possible to charge an appropriate fee corresponding to the level of the interpreter.
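
The fee calculation described above may be sketched, purely by way of example, as follows. The rates in the accounting table, the currency unit, the rule of billing per started minute and the name interpretation_fee are assumptions made for this illustration; the specification does not prescribe them.

```python
import math

# Hypothetical accounting table: interpretation level -> hourly rate.
HOURLY_RATE_BY_LEVEL = {1: 6000, 2: 4000, 3: 2500}


def interpretation_fee(level, seconds, rates=HOURLY_RATE_BY_LEVEL):
    """Fee for a measured service time, billed per started minute (an assumption)."""
    minutes = math.ceil(seconds / 60)
    return rates[level] * minutes / 60
```

For example, thirty minutes of service at level 1 yields half of the level 1 hourly rate.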

A videophone interpretation method according to preferred embodiments of the present invention is a method in which an interpreter interprets a videophone conversation between a caller and a callee who speak different languages, the method using an interpreter registration table in which at least the language types interpretable by an interpreter and the terminal number of the interpreter are registered, wherein the method includes steps of accepting a call from a caller terminal, acquiring the terminal number of a callee, language type of the caller and the language type of the callee from the caller terminal for which the call was accepted, extracting the terminal number of the interpreter by referencing the interpreter registration table from the acquired language type of the caller and language type of the callee, calling the interpreter terminal by using the terminal number of the interpreter extracted, calling the callee terminal by using the acquired terminal number of the callee, transmitting video including at least video from the callee terminal and audio including at least audio from the interpreter terminal to the caller terminal, transmitting video including at least video from the caller terminal and audio including at least audio from the interpreter terminal to the callee terminal, and transmitting audio including at least audio from the caller terminal and audio from the callee terminal to the interpreter terminal.

With this configuration, upon a call from a caller terminal, the terminal number of an interpreter capable of interpreting between the language of the caller and the language of the callee is extracted from the interpreter registration table, and the caller terminal, the callee terminal and the interpreter terminal are automatically connected, and video and audio required for interpretation are communicated. The caller need not previously search for an interpreter and conduct consultation with the callee, thus providing a videophone interpretation service which is available even in an emergency. The interpreter can join a videophone conversation anywhere he/she may be, as long as he/she can be called. This minimizes the time occupied by the interpreter and reduces the interpretation service cost.

A videophone interpretation method according to preferred embodiments of the present invention is a method in which a videophone conversation between a caller and a callee using different languages is interpreted by a first interpreter who interprets the language of a callee into the language of a caller and a second interpreter who interprets the language of the caller into the language of the callee, the method using an interpreter registration table in which at least the language types interpretable by an interpreter and terminal number of the interpreter are registered, wherein the method includes steps of accepting a call from a caller terminal, acquiring the terminal number of a callee, language type of the caller and the language type of the callee from the caller terminal for which the call was accepted, extracting the terminal number of a first interpreter by referencing the interpreter registration table from the acquired language type of the callee and language type of the caller, calling the first interpreter terminal by using the terminal number of the first interpreter extracted, extracting the terminal number of a second interpreter by referencing the interpreter registration table from the acquired language type of the caller and language type of the callee, calling the second interpreter terminal by using the terminal number of the second interpreter extracted, calling the callee by using the acquired terminal number of the callee, transmitting video including at least video from the callee terminal and audio including at least audio from the first interpreter terminal to the caller terminal, transmitting video including at least video from the caller terminal and audio including at least audio from the second interpreter terminal to the callee terminal, transmitting audio including at least audio from the callee terminal to the first interpreter terminal, and transmitting audio including at least audio from the caller terminal to the second interpreter terminal.

With this configuration, upon a call from a caller terminal, the terminal number of a first interpreter who interprets the language of the callee into the language of the caller and the terminal number of a second interpreter who interprets the language of the caller into the language of the callee are extracted. The caller terminal, the callee terminal, the first interpreter terminal, and the second interpreter terminal are automatically connected, followed by communications of video and audio required for interpretation. The caller need not previously search for an interpreter and conduct consultation with the callee, thus providing a videophone interpretation service which may be available even in an emergency. The interpreter can join a videophone conversation anywhere he/she may be, as long as he/she can be called. This minimizes the time occupied by the interpreter and reduces the interpretation service cost.
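
The minimum media routing recited in the two methods above may be summarized by the following sketch. It lists, for each terminal, only the sources that the method steps require ("at least"). The terminal labels and the name media_routes are hypothetical, and an actual system may transmit additional video or audio.

```python
def media_routes(two_interpreters=False):
    """Map each receiving terminal to the minimum video and audio sources it is sent."""
    if not two_interpreters:
        return {
            "caller": {"video": ["callee"], "audio": ["interpreter"]},
            "callee": {"video": ["caller"], "audio": ["interpreter"]},
            "interpreter": {"video": [], "audio": ["caller", "callee"]},
        }
    # interpreter_1 renders the callee's language into the caller's language,
    # interpreter_2 renders the caller's language into the callee's language.
    return {
        "caller": {"video": ["callee"], "audio": ["interpreter_1"]},
        "callee": {"video": ["caller"], "audio": ["interpreter_2"]},
        "interpreter_1": {"video": [], "audio": ["callee"]},
        "interpreter_2": {"video": [], "audio": ["caller"]},
    }
```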

Other features, elements, steps, characteristics and advantages of the present invention will become more apparent from the following detailed description of preferred embodiments thereof with reference to the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a system block diagram of a videophone interpretation system according to a first preferred embodiment of the present invention;

FIG. 2 shows an example of a video displayed on the screen of a terminal in the videophone interpretation system according to the first preferred embodiment of the present invention;

FIG. 3 shows an example of an interpreter registration table in the videophone interpretation system according to the first preferred embodiment of the present invention;

FIG. 4 is a processing flowchart of the control processing of a controller in the videophone interpretation system according to the first preferred embodiment of the present invention;

FIG. 5 shows an example of a screen for prompting input of the language type of a caller and a callee;

FIG. 6 shows an example of a screen for prompting input of interpreter selection conditions;

FIG. 7 shows an example of a screen for prompting input of the terminal number of a callee;

FIG. 8 is a system block diagram of a videophone interpretation system according to a second preferred embodiment of the present invention;

FIG. 9 shows an example of a connection table;

FIG. 10 is a processing flowchart of the control processing of a controller in the videophone interpretation system according to the second preferred embodiment of the present invention;

FIG. 11 is a system block diagram of a videophone interpretation system according to a third preferred embodiment of the present invention;

FIG. 12 shows an example of video displayed on the screen of a terminal in the videophone interpretation system according to the third preferred embodiment of the present invention;

FIG. 13 shows an example of an interpreter registration table in the videophone interpretation system according to the third preferred embodiment of the present invention;

FIG. 14 is a processing flowchart of the control processing of a controller in the videophone interpretation system according to the third preferred embodiment of the present invention;

FIG. 15 is a block diagram showing an example of an audio communications function in the videophone interpretation system according to the first preferred embodiment of the present invention;

FIG. 16 is a block diagram showing another example of the audio communications function in the videophone interpretation system according to the first preferred embodiment of the present invention;

FIG. 17 is a block diagram showing an example of the audio communications function in the videophone interpretation system according to the third preferred embodiment of the present invention;

FIG. 18 is a block diagram showing another example of the audio communications function in the videophone interpretation system according to the third preferred embodiment of the present invention;

FIG. 19 is a block diagram showing an example of a recording/reproduction function in the videophone interpretation system according to the first preferred embodiment of the present invention;

FIG. 20 is a block diagram showing an example of a recording/reproduction function in the videophone interpretation system according to the third preferred embodiment of the present invention;

FIG. 21 shows an example of video displayed on each terminal screen by way of the recording/reproduction function; and

FIG. 22 is a system block diagram of a videophone interpretation system using a videoconference service with a multipoint conferencing unit.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 is a system block diagram of a videophone interpretation system according to a first preferred embodiment of the invention. This preferred embodiment shows a system configuration example assuming that a terminal used by a caller, a callee or an interpreter is a telephone-type videophone terminal connected to a public telephone line.

In FIG. 1, a numeral 100 represents a videophone interpretation system installed in an interpretation center which provides an interpretation service. The videophone interpretation system 100 interconnects a videophone terminal used by a caller (hereinafter referred to as a caller terminal) 10, a videophone terminal used by a callee (hereinafter referred to as a callee terminal) 20, and a videophone terminal used by an interpreter (hereinafter referred to as an interpreter terminal) 30 via a public telephone line 40 in order to provide a videophone interpretation service in which a videophone conversation between a caller and a callee is interpreted by an interpreter.

The caller terminal 10, callee terminal 20 and interpreter terminal 30 each include a television camera (a) for capturing each user, a display screen (b) for displaying the received video, a dial pad (c) for input of a number or information, and a headset (d) for audio input/output. Audio input/output need not use a headset; a handset of a typical telephone set may be used instead.

Such a videophone terminal connected to a public line may be an ISDN videophone terminal based on ITU-T recommendation H.320. The present invention may also use a videophone terminal which uses a unique protocol.

The public telephone line may be of a wireless type. The videophone terminal may be a cellular phone or a portable terminal equipped with a videophone function.

The videophone interpretation system 100 includes a caller terminal line interface (interface being hereinafter referred to as I/F) 120 to connect to a caller terminal, a callee terminal line I/F 140 to connect to a callee terminal, and an interpreter terminal line I/F 160 to connect to an interpreter terminal. To each I/F, a multiplexer/demultiplexer 122, 142, 162 for multiplexing/demultiplexing a video signal, an audio signal or a data signal, a video CODEC (coder/decoder) 124, 144, 164 for compressing/expanding a video signal, and an audio CODEC 126, 146, 166 for compressing/expanding an audio signal are connected. Each line I/F, each multiplexer/demultiplexer, and each video CODEC or each audio CODEC performs call control, streaming control and compression/expansion of a video/audio signal in accordance with a protocol used by each terminal.

To the video input of the caller terminal video CODEC 124, a video synthesizer 128 for synthesizing the video output of the callee terminal video CODEC 144, the video output of the interpreter terminal video CODEC 164 and the output of the caller terminal telop memory 132 is connected. To the video input of the callee terminal video CODEC 144, a video synthesizer 148 for synthesizing the video output of the caller terminal video CODEC 124, the video output of the interpreter terminal video CODEC 164, and the output of the callee terminal telop memory 152 is connected.

To the video input of the interpreter terminal video CODEC 164, a video synthesizer 168 for synthesizing the video output of the caller terminal video CODEC 124, the video output of the callee terminal video CODEC 144, and the output of the interpreter terminal telop memory 172 is connected.

While video display of an interpreter may be omitted on a caller terminal or a callee terminal, understanding of the voice interpreted by the interpreter is facilitated by displaying the video of the interpreter, such that it is preferable to be able to synthesize the video of an interpreter.

While video display of a caller or a callee may be omitted on an interpreter terminal, understanding of the voice to be interpreted by the interpreter is facilitated by displaying the videos, such that it is preferable to be able to display the video of a caller or a callee.

FIG. 2 shows an example of a video displayed on the screen of each terminal during a videophone conversation by the videophone interpretation system 100. FIG. 2(a) shows the screen of a caller terminal, on which a synthesized video of a callee and an interpreter obtained by the video synthesizer 128 is displayed. While the video of the callee is displayed as a main window and the video of the interpreter is displayed as a sub window in a Picture-in-Picture fashion in this example, the Picture-in-Picture may display the video of the interpreter as a main window and the video of the callee as a sub window. Or, these videos may be displayed in equal size. FIG. 2(b) shows the screen of a callee terminal, on which a synthesized video of a caller and an interpreter obtained by the video synthesizer 148 is displayed. While the video of the caller is displayed as a main window and the video of the interpreter is displayed as a sub window in a Picture-in-Picture fashion in this example, the Picture-in-Picture may display the video of the interpreter as a main window and the video of the caller as a sub window. Or, these videos may be displayed in equal size. FIG. 2(c) shows the screen of an interpreter terminal, on which a synthesized video of a caller and a callee obtained by the video synthesizer 168 is displayed.

To the audio input of the caller terminal audio CODEC 126, an audio synthesizer 130 for synthesizing the audio output of the callee terminal audio CODEC 146 and the audio output of the interpreter terminal audio CODEC 166 is connected. To the audio input of the callee terminal audio CODEC 146, an audio synthesizer 150 for synthesizing the audio output of the caller terminal audio CODEC 126 and the audio output of the interpreter terminal audio CODEC 166 is connected.

To the audio input of the interpreter terminal audio CODEC 166, an audio synthesizer 170 for synthesizing the audio output of the caller terminal audio CODEC 126 and the audio output of the callee terminal audio CODEC 146 is connected.

The audio output of the interpreter terminal audio CODEC 166 is input to a selector 174. Based on a command from an interpreter terminal, the audio output is supplied to the caller terminal audio synthesizer 130 in case the interpreter interprets the language of the callee to the language of a caller, and to the callee terminal audio synthesizer 150 in case the interpreter interprets the language of a caller to the language of the callee. As a result, the audio of the interpreter is transmitted to either the caller or the callee requiring the audio. Thus, it is possible to prevent the speech of a caller or a callee from being disturbed by the unnecessary voice of an interpreter, thereby providing a smooth conversation.

The caller terminal audio synthesizer 130 is equipped with a function to suppress the audio level from the callee terminal or switch audio from the callee terminal to audio from the interpreter terminal when audio from the interpreter terminal is detected. The callee terminal audio synthesizer 150 is equipped with a function to suppress the audio level from the caller terminal or switch audio from the caller terminal to audio from the interpreter terminal when audio from the interpreter terminal is detected. This prevents overlapping of the audio of the interpretation by the interpreter over the audio of the opponent party which causes difficulty in listening. The interpreter can simultaneously interpret the speech of the speaker, thus enabling a quick and precise interpretation.

FIG. 15 shows specific examples of the function to switch the destination of the interpreter audio in the selector 174 and the function to suppress the audio of the callee or caller in the audio synthesizers 130, 150. As shown in FIG. 15, the audio output of the interpreter terminal audio CODEC 166 is connected to a caller terminal audio signal adder 190 and a callee terminal audio signal adder 193 via the selector 174. The audio of the interpreter is supplied to either the caller or callee by a signal from a PB detector 175. The PB detector 175 detects a predetermined number for selecting a caller or a callee on the dial pad of a terminal that is pressed based on a data signal or a tone signal included in an audio signal from the interpreter terminal, and switches the selector 174 to the specified side. The interpreter specifies the caller or callee as a destination of his/her voice by the dial pad before he/she interprets. Thus, the caller or the callee who need not listen to the audio of the interpreter does not receive the audio of the interpreter.

The audio output of the callee terminal audio CODEC 146 is connected to the caller terminal audio signal adder 190 via an attenuator 191, which attenuates the audio from the callee terminal when the audio from the interpreter is detected by the signal detector 192. The audio output of the caller terminal audio CODEC 126 is connected to the callee terminal audio signal adder 193 via an attenuator 194, which attenuates the audio from the caller terminal when the audio of the interpreter is detected by the signal detector 195. The signal detectors 192, 195 are set to an appropriate detection level in order to prevent the audio of the opponent party from being attenuated by mistake due to noise.

In order to ensure that the caller or the callee can hear the audio of the interpreter immediately after the audio of the interpreter is detected by the signal detector 192, 195, an appropriate signal delay unit may be provided at the interpreter audio input of the audio signal adder 190, 193.
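
The cooperation of the selector, the signal detector, the attenuator and the audio signal adder described above may be modeled, for illustration only, on one frame of audio samples as follows. The threshold, the attenuation factor, the frame representation and the function names are assumptions of this sketch. The signal delay unit and the PB detection itself are not modeled.

```python
def route_interpreter(frame, destination):
    """Selector: pass the interpreter frame to one side and silence to the other.

    Returns a pair (frame for the caller side, frame for the callee side).
    """
    silence = [0.0] * len(frame)
    if destination == "caller":
        return frame, silence
    if destination == "callee":
        return silence, frame
    raise ValueError("destination must be 'caller' or 'callee'")


def mix_for_listener(partner, interpreter, threshold=0.05, attenuation=0.2):
    """Adder with attenuator: lower the partner audio while the interpreter speaks."""
    # Signal detector: interpreter audio counts as present above the threshold,
    # so that low-level noise does not attenuate the partner by mistake.
    detected = max((abs(s) for s in interpreter), default=0.0) >= threshold
    gain = attenuation if detected else 1.0
    return [gain * p + i for p, i in zip(partner, interpreter)]
```

With an attenuation factor of 0.0, the function behaves like a switch that turns off the partner audio whenever interpreter audio is detected.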

While the audio of the opponent party is attenuated by the attenuator 191, 194 such that the caller or the callee can hear the original voice of the opponent party to some extent in the background of the audio of the interpreter in this embodiment, a switch may be provided instead to turn off the audio of the opponent party.

FIG. 16 shows an example in which the audio of the opponent party is turned off when the audio of the interpreter is transmitted and only the audio of the interpreter is transmitted. As shown in FIG. 16, switches 197, 198 are used instead of the audio signal adders 190, 193. When the audio of the interpreter is detected by the signal detectors 192, 195, the switches 197, 198 are turned from the audio of the opponent party to the audio of the interpreter. The remaining configuration is the same as that shown in FIG. 15.

In addition, in order to ensure that the caller or the callee can hear the audio of the interpreter immediately after the audio of the interpreter is detected by the signal detector 192, 195, an appropriate signal delay unit may be provided at the interpreter audio input of the switches 197, 198.

While the audio signal adder 190, 193 simply adds the audio of the interpreter and the audio of the opponent party in the above example, audio multiplexing of two signals may be used as well. For example, if a terminal supports stereophonic audio, a stereophonic synthesis is performed with the audio of the opponent party as the left channel and the audio of the interpreter as the right channel and the resulting signal is transmitted to a terminal, where the receiving party selects the necessary audio. In this configuration, it is not necessary to provide an attenuator to attenuate the audio of the opponent party in the videophone interpretation system. The receiving party listens to the audio while adjusting the volume balance of the right and left channels of a headset.
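
The stereophonic alternative may be illustrated by the following sketch, in which the system only pairs the two signals and the receiving party applies the volume balance. The sample representation, the balance parameter and the function names are hypothetical.

```python
def stereo_synthesize(partner, interpreter):
    """Pair samples as (left, right): partner on the left, interpreter on the right."""
    return list(zip(partner, interpreter))


def listen(stereo, balance):
    """Receiver side: balance 0.0 plays only the left channel, 1.0 only the right."""
    return [(1.0 - balance) * left + balance * right for left, right in stereo]
```

Because the balance is applied at the receiving terminal, the system itself needs no attenuator in this configuration.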

While the audio of the interpreter is transmitted to either the caller or the callee as selected by the selector 174 in the above example, the audio of the interpreter may be supplied to each of the audio signal adder 190 (or the switch 197) and the audio signal adder 193 (or the switch 198) via an attenuator in order to attenuate an audio signal to a party where the audio is not required based on detection by the PB detector 175. In this manner, some of the audio of the interpreter is transmitted to the speaker using an attenuator. The speaker thus checks that his/her speech is interpreted while he/she is speaking.

The videophone interpretation system 100 is equipped with an interpreter registration table 112 in which the terminal number of an interpreter is registered and includes a controller 110 connected to each of the line I/Fs 120, 140, 160, multiplexers/demultiplexers 122, 142, 162, video synthesizers 128, 148, 168, audio synthesizers 130, 150, 170, and telop memories 132, 152, 172. The controller 110 provides a function to connect a caller terminal, a callee terminal and an interpreter terminal using a function to accept a call from a caller terminal, a function to acquire the language type of the caller and the language type of the callee, a function to acquire the selection conditions for selecting an interpreter, a function to extract the terminal number of the interpreter by referencing the interpreter registration table 112 using the acquired language type and selection conditions, a function to call the interpreter terminal using the terminal number of the interpreter extracted, and a function to call the callee terminal using the acquired terminal number of the callee.

Operation of the video synthesizers 128, 148, 168 and audio synthesizers 130, 150, 170 is controlled by the controller 110. A function is included in which the user changes the video output method or audio output method by pressing a predetermined number button of a dial pad of each terminal. The multiplexer/demultiplexer 122, 142, 162 detects the number button on the dial pad of each terminal that is pressed based on a data signal or a tone signal and signals the detection to the controller 110. This ensures flexibility in the usage of the system on each terminal. For example, only the necessary video or audio may be selected and displayed or output in accordance with the purpose, or it is possible to replace a main window with a sub window, or change the position of the sub window.

To the input of the video synthesizers 128, 148, 168, a caller terminal telop memory 132, a callee terminal telop memory 152, and an interpreter terminal telop memory 172 are connected respectively. Contents of each telop memory 132, 152, 172 can be set by the controller 110. With this configuration, by setting a message to be displayed on each terminal to the telop memory 132, 152, 172 and issuing a command to select a signal of the telop memory 132, 152, 172 to the video synthesizer 128, 148, 168 in the setup of a videophone conversation via interpretation, it is possible to transmit necessary messages to respective terminals to establish a three-way call.

If there is a term which is difficult to explain or a word that is difficult to pronounce in a videophone conversation, it is possible to register in advance the term in the term registration table 113 of the controller 110 in association with the number of the dial pad on each terminal. By doing so, it is possible to detect that the dial pad on each terminal is pressed during a videophone conversation by using a data signal or a tone signal on the multiplexer/demultiplexer 122, 142, 162, extract a term corresponding to the number of the dial pad pressed from the term registration table 113, generate a text telop, and set the text telop to each telop memory, thereby displaying the term on each terminal. This communicates, by a text telop, to the opponent party a term that is difficult to explain or a word that is difficult to pronounce, to thus provide a quicker and more precise videophone conversation.
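
The term registration and telop generation described above may be sketched as follows, for illustration only. The class TermRegistry, the function on_dial_key and the representation of the telop memories as a dictionary are hypothetical; an actual system would render the term into a video telop rather than store plain text.

```python
class TermRegistry:
    """Hypothetical term registration table keyed by terminal and dial pad number."""

    def __init__(self):
        self._terms = {}

    def register(self, terminal, key, term):
        """Register a term in association with a dial pad number of a terminal."""
        self._terms[(terminal, key)] = term

    def lookup(self, terminal, key):
        """Return the registered term, or None if nothing is registered."""
        return self._terms.get((terminal, key))


def on_dial_key(registry, telop_memories, terminal, key):
    """When a registered key is pressed, set its term to every telop memory."""
    term = registry.lookup(terminal, key)
    if term is None:
        return False  # key carries no registered term, so the telops are unchanged
    for name in telop_memories:
        telop_memories[name] = term
    return True
```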

Next, the connection processing by the controller 110 for establishing a videophone conversation via interpretation is described.

Prior to processing, interpreter selection information and a terminal number of a terminal used by each interpreter are registered in the interpreter registration table 112 of the controller 110 from an appropriate terminal (not shown). FIG. 3 shows an example of registration items to be registered in the interpreter registration table 112. The interpreter selection information is information for selecting an interpreter desired by a user, which includes a gender, an age, supported languages, a habitation, a specialty, and the like. For the supported languages, the level of an interpreter is registered by language to enable the user to select an interpreter of a desired level between the target languages. In this example, the levels of interpretation are represented by 1 (Advanced), 2 (Middle) and 3 (Basic). The habitation assumes a case in which the user desires a person who has geographic knowledge on a specific area and, in this example, a ZIP code is used to specify an area. The specialty assumes a case in which, if the conversation pertains to a specific field, the user desires a person who has expert knowledge on the field or is familiar with the topics in the field. In this example, the fields an interpreter is familiar with are classified into several categories to be registered, such as politics, law, business, education, science and technology, medical care, language, sports, and hobby. The specialties are diverse, such that they may be registered hierarchically and searched through at a level desired by the user.

In addition, qualifications of the interpreter may be registered in advance such that the user can select a qualified person as an interpreter.

The terminal number to be registered is the telephone number of the terminal because, in this example, a videophone terminal to connect to a public telephone line is provided.

In the interpreter registration table 112, an availability flag is provided to indicate whether an interpreter accepts the interpretation. A registered interpreter can call the interpretation center from his/her terminal and enter a command by using a dial pad to set/reset the availability flag. Thus, an interpreter registered in the interpreter registration table can set the availability flag only when he/she is available for interpretation, thereby eliminating useless calling and enabling the user to select an available interpreter without delay.

FIG. 4 shows a processing flowchart of the connection processing by the controller 110. The videophone interpretation system 100 accepts an order for an interpretation service when the caller calls a telephone number of the caller terminal line I/F. The videophone interpretation system 100 then calls the interpreter terminal and the callee terminal, and establishes a connection for the videophone interpretation service.

As shown in FIG. 4, the presence of a call to the caller terminal line I/F 120 is detected (S100). When a call is detected, a screen to prompt input of the language type of the caller is displayed on the caller terminal (S102). This is accomplished, for example, by setting a message shown in FIG. 5(a) to the caller terminal telop memory 132. The language type of the caller input by the caller is acquired (S104). Afterwards, messaging to the caller terminal and the interpreter terminal is provided using the language type of the caller acquired. Next, a screen which prompts input of a language type of the callee is displayed on the caller terminal (S106). This is accomplished, for example, by setting a message shown in FIG. 5(b) to the caller terminal telop memory 132. The language type of the callee input by the caller is acquired (S108). Afterwards, messaging to the callee terminal is made using the language type of the callee acquired.

A screen which prompts input of interpreter selection conditions is displayed on the caller terminal (S110). This is accomplished, for example, by setting a message shown in FIG. 6(a) to the caller terminal telop memory 132. The interpreter selection conditions input by the caller are acquired (S112). The interpreter selection conditions input by the caller are a gender, an age bracket, an area, a specialty and an interpretation level. The area is specified by using a ZIP code and an interpreter is selected beginning with the habitation closest to the specified area. If it is not necessary to specify a condition for a given item, “N/A” may be selected.
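
The handling of the selection conditions may be sketched as follows, purely as an example. A condition left as “N/A” is represented here by None and is not checked. The specification does not state how closeness between ZIP codes is measured; the numeric difference of digits-only codes used below is a crude stand-in chosen only for this illustration. The field names and function names are likewise hypothetical, and matching of the age bracket and the interpretation level is omitted for brevity.

```python
def matches(interpreter, conditions):
    """True if every specified condition equals the registered value."""
    return all(
        wanted is None or interpreter.get(key) == wanted  # None stands for "N/A"
        for key, wanted in conditions.items()
        if key != "zip"  # the area only orders candidates, it does not exclude them
    )


def candidate_list(table, conditions):
    """Filter by the conditions, then order by closeness to the requested area."""
    selected = [entry for entry in table if matches(entry, conditions)]
    area = conditions.get("zip")
    if area is not None:
        # Assumption: numeric distance between ZIP codes approximates closeness.
        selected.sort(key=lambda entry: abs(int(entry["zip"]) - int(area)))
    return selected
```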

Next, an interpreter who has a specified interpretation level of the language of the caller and the language of the callee, whose gender, age, habitation and specialty satisfy the acquired selection conditions, and whose availability flag is set is extracted with reference to the interpreter registration table 112, and the caller terminal displays a list of interpreter candidates and prompts input of the selection number of a desired interpreter (S114). This is accomplished, for example, by setting a message and an interpreter list shown in FIG. 6(b) to the caller terminal telop memory 132. The hourly rates of the interpreter (not shown) registered in the interpreter registration table 112 are then extracted and displayed as a fee. This enables the user to consider the cost of the interpretation service before selecting an appropriate interpreter. The hourly rates of the interpreter may be determined from the interpretation level of the selected interpreter by referencing an accounting table which specifies the relationship between the interpretation level and the hourly rates. The selection number input by the caller referring to the interpreter candidate list is acquired (S116). The terminal number of the selected interpreter is extracted from the interpreter registration table 112 and called (S118). Personal information about a caller, language types of the caller and callee, and interpreter selection conditions may be communicated to the interpreter terminal by using the interpreter terminal telop memory 172 so that the interpreter can decide whether to accept the interpretation. Personal information about the caller may be available, for example, from pre-registered member information when the interpretation service is a membership service.
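The extraction of candidates in S114 can be illustrated with a minimal Python sketch. The record layout, the field names, the numeric level scale and the use of numeric ZIP code distance as a stand-in for closeness of habitation are all assumptions made for illustration; the interpreter registration table 112 is not specified at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class Interpreter:
    # Hypothetical record of the interpreter registration table.
    terminal_number: str
    languages: dict      # language -> interpretation level (assumed: higher is better)
    gender: str
    age_bracket: str
    zip_code: str
    specialty: str
    available: bool      # availability flag


def select_candidates(table, caller_lang, callee_lang, min_level=1,
                      gender=None, age_bracket=None, specialty=None,
                      zip_code=None):
    """Return available interpreters covering both languages.

    A condition left as None corresponds to selecting "N/A".
    """
    def matches(i):
        return (i.available
                and i.languages.get(caller_lang, 0) >= min_level
                and i.languages.get(callee_lang, 0) >= min_level
                and gender in (None, i.gender)
                and age_bracket in (None, i.age_bracket)
                and specialty in (None, i.specialty))

    hits = [i for i in table if matches(i)]
    if zip_code is not None:
        # Assumption: numeric ZIP distance approximates geographic closeness.
        hits.sort(key=lambda i: abs(int(i.zip_code) - int(zip_code)))
    return hits
```

The returned list corresponds to the interpreter candidate list presented to the caller, ordered beginning with the candidate closest to the specified area.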

When a response is received from the interpreter terminal (S120), a screen which prompts input of the terminal number of the callee is displayed on the caller terminal (S122). This is accomplished, for example, by setting a message shown in FIG. 7 to the caller terminal telop memory 132. The terminal number of the callee input by the caller is extracted and the callee is called (S124). Similar to the procedure described above, personal information about a caller, language types of the caller and callee, and interpreter selection conditions may be communicated to the callee terminal by using the callee terminal telop memory 152 so as to confirm whether to accept the call and to determine whether an error in the set conditions has occurred.

When a response is received from the callee terminal (S126), a videophone interpretation service begins (S128).

If a response is not received from the interpreter terminal in S120, whether another candidate is available is determined (S130). If another candidate is available, execution returns to S118 and the procedure is repeated. If another candidate is unavailable, the caller terminal is notified of such and the call is released (S132). If a response is not received from the callee terminal in S126, the caller terminal and the selected interpreter terminal are notified of such and the call is released (S134).
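The branching of FIG. 4 from S118 onward can be summarized in a short Python sketch. The function name, the status strings and the `call` callback are hypothetical, and the sketch simplifies the flow by trying the candidates in list order rather than prompting the caller for each choice.

```python
def connect(candidates, callee_number, call):
    """Sketch of S118 to S134.

    call(number) is an assumed callback that returns True when the
    called terminal responds. Returns (status, interpreter_number).
    """
    for interpreter_number in candidates:       # S118, repeated via S130
        if call(interpreter_number):            # S120: interpreter responds
            break
    else:
        return ("no_interpreter", None)         # S132: notify caller, release

    if not call(callee_number):                 # S124 / S126
        # S134: notify caller and selected interpreter, release
        return ("callee_unavailable", interpreter_number)

    return ("connected", interpreter_number)    # S128: service begins
```

Returning a status instead of raising an exception mirrors the flowchart, in which every branch ends either in the start of the service or in a notification followed by release of the call.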

The controller 110 includes a timer (not shown) for calculating the fee of the interpretation service. The timer measures the time from when the connection is established to when it is released. On completion of an interpretation service, the fee is calculated based on the time measured by the timer and the hourly rates mentioned above, registered in an accounting database 114, and charged to the user at a later time.
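As a worked illustration of the fee calculation, the following Python sketch multiplies the measured connection time by the hourly rate. The function name and the rounding of the connection time up to a whole billing unit are assumptions; the description only states that the fee is based on the measured time and the hourly rates.

```python
import math


def interpretation_fee(connected_at, released_at, hourly_rate, billing_unit_s=60):
    """Fee for one service; times are in seconds (e.g. timer readings).

    Assumption: the elapsed time is rounded up to a whole billing unit.
    """
    elapsed = max(0.0, released_at - connected_at)
    units = math.ceil(elapsed / billing_unit_s)
    return hourly_rate * units * billing_unit_s / 3600
```

For example, a 30-minute connection at an hourly rate of 60 yields a fee of 30 in the same currency unit.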

While the caller is simply notified and the call is released when the selected interpreter terminal does not accept the call in the preferred embodiment described above, an interpretation reservation table to register a caller terminal number and a callee terminal number may be provided, and the caller and the callee may be notified upon a later response from the selected interpreter in order to set up a videophone conversation.

While the caller is prompted to input the language types of the caller and the callee for selection of an interpreter in this preferred embodiment, a telephone number of an interpretation center may be specified per language type of the caller or per combination of the language type of the caller and the language type of the callee in order to acquire the language type of the caller or the callee. While the caller is prompted to input the interpreter selection conditions for selecting an interpreter in this preferred embodiment, the caller may first be prompted whether to specify the interpreter selection conditions, and if he/she has decided not to specify the interpreter selection conditions, only the input language types may be used to select an interpreter.

A configuration is provided where, in an emergency, the caller first dials a specific number to automatically call an interpreter dedicated to an emergency situation.

While the videophone interpretation system 100 includes a line I/F, a multiplexer/demultiplexer, a video CODEC, an audio CODEC, a video synthesizer, an audio synthesizer and a controller in this preferred embodiment, these components need not be provided by individual hardware (H/W), and instead the function of each component may be provided by software processing running on a computer.

While the interpreter terminal 30, similar to the caller terminal 10 and the callee terminal 20, is located outside the interpretation center and called from the interpretation center over a public telephone line to provide an interpretation service in this preferred embodiment, the present invention is not limited thereto, and some or all of the interpreter terminals may be installed in the interpretation center such that the interpretation services are provided from the interpretation center.

In this preferred embodiment, an interpreter can join an interpretation service anywhere he/she may be, as long as he/she has a terminal which can be connected to a public telephone line. Thus, the interpreter can provide an interpretation service by using the availability flag to make efficient use of free time. This enables efficient and stable operation of interpretation services which often have difficulty in securing necessary personnel.

While a video signal of the home terminal is not input to the video synthesizers 128, 148, 168 in this preferred embodiment, a function may be provided to input the video signal of the home terminal, and synthesize and display the video signal to check the video on the terminal.

While the video synthesizers 128, 148, 168 are used to synthesize videos for each terminal in this preferred embodiment, the present invention is not limited thereto, and videos from all terminals may be synthesized at the same time, and the result may be transmitted to each terminal. In this case, as shown in FIG. 21(a) for example, a video of the caller, a video of the callee and a video of the interpreter may be displayed in a four split screen.

While a function is provided whereby the telop memories 132, 152, 172 are provided and their outputs are added to the corresponding video synthesizers 128, 148, 168, respectively, in order to display a text telop on each terminal in this preferred embodiment, a function may be provided whereby telop memories to store audio information are provided and each output is added to the audio synthesizers 130, 150, 170 in order to output an audio message on each terminal. This makes it possible to provide a videophone interpretation service even if any of the caller, the callee or the interpreter is a visually impaired person.

FIG. 8 is a system block diagram of a videophone interpretation system according to a second preferred embodiment of the invention. In this preferred embodiment, the terminals used by a caller, a callee and an interpreter are IP (Internet Protocol) type videophone terminals which are connected to the Internet and equipped with a web browser.

In FIG. 8, a numeral 200 represents a videophone interpretation system installed in an interpretation center to provide an interpretation service. The videophone interpretation system 200 connects a caller terminal 60 used by a caller, a callee terminal 70 used by a callee, and any of the interpreter terminals 231, 232, . . . used by an interpreter via the Internet 80 in order to provide a videophone interpretation service to the caller and the callee.

In this example, the caller terminal 60, the callee terminal 70 and the interpreter terminals 231, 232, . . . each include a general-purpose processing device (a), such as a personal computer, having a video input I/F function, an audio input/output I/F function and a network connection function. The processing device is equipped with a keyboard (b) and a mouse (c) for input of information, a display (d) for displaying a web page screen presented by a web server 210 and a videophone screen supplied by a communications server 220, a television camera (e) for capturing the video of each terminal user, and a headset (f) for performing audio input/output for each terminal user, and has IP videophone software and a web browser installed. A dedicated videophone terminal may be used instead.

The videophone terminal connected to the Internet may be an IP videophone terminal based on ITU-T recommendation H.323. However, the invention is not limited thereto, and may use a videophone terminal which employs a unique protocol.

The Internet may be of a wireless LAN type. The videophone terminal may be a cellular phone or a portable terminal equipped with a videophone function and also including a web access function.

The videophone interpretation system 200 includes a communications server 220 including a connection table 222 for setting the terminal addresses of a caller terminal, a callee terminal and an interpreter terminal, and a function to interconnect the terminals registered in the connection table 222 and synthesize video and audio received from each terminal and transmit the synthesized video and audio to each terminal; a web server 210 including an interpreter registration table 212 for registering the interpreter selection information, terminal address and availability flag of each interpreter as described above, and a function to select a desired interpreter based on an access from a caller terminal by using a web browser and set the terminal address of each of the caller terminal, the callee terminal and the interpreter terminal in the connection table 222 of the communications server 220; a router 250 for connecting the web server 210 and the communications server 220 to the Internet; and a plurality of interpreter terminals 231, 232, . . . , 23N connected to the communications server 220 via a network.

FIG. 9 shows an example of a connection table 222. As shown in FIG. 9, the terminal address of a caller terminal, the terminal address of a callee terminal and the terminal address of an interpreter terminal are registered together as a set in the connection table 222. This provides a single interpretation service. The connection table 222 is designed to register a plurality of such terminal address sets depending on the throughput of the communications server 220, thereby simultaneously providing a plurality of interpretation services.

While the terminal address registered in the connection table 222 is an address on the Internet and is generally an IP address, the invention is not limited thereto, and, for example, a name given by a directory server may be used.
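A connection table of this kind can be modeled, as a rough Python sketch, as a bounded collection of address sets. The class names, the `capacity` parameter standing in for the throughput limit of the communications server, and the choice to key each set by the caller address are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressSet:
    # One row of the table: the three addresses that form one service.
    caller: str
    interpreter: Optional[str] = None
    callee: Optional[str] = None

    def ready(self):
        """True once all three terminal addresses have been set."""
        return self.interpreter is not None and self.callee is not None


class ConnectionTable:
    def __init__(self, capacity):
        self.capacity = capacity   # assumed stand-in for server throughput
        self.rows = {}

    def open(self, caller):
        """Register a caller address and start a new address set."""
        if len(self.rows) >= self.capacity:
            raise RuntimeError("connection table full")
        self.rows[caller] = AddressSet(caller)
        return self.rows[caller]

    def release(self, caller):
        """Remove the address set when the call is released."""
        self.rows.pop(caller, None)
```

Because the addresses are stored as plain strings, the same structure would hold either IP addresses or names resolved through a directory server.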

The communications server 220 performs packet communications using a predetermined protocol with the caller terminal, the callee terminal and the interpreter terminal set in the connection table 222 and provides, by way of software processing, functions similar to those provided by the multiplexers/demultiplexers 122, 142, 162, the video CODECs 124, 144, 164, the audio CODECs 126, 146, 166, the video synthesizers 128, 148, 168, and the audio synthesizers 130, 150, 170 in the videophone interpretation system 100.

With this configuration, similar to the videophone interpretation system 100, prescribed videos and audios are communicated between a caller terminal, a callee terminal and an interpreter terminal, and a videophone interpretation service is provided between the caller and the callee.

While the videophone interpretation system 100 preferably uses the controller 110 and the telop memories 132, 152, 172 to extract a term registered in the term registration table 113 during a videophone conversation by a command from a terminal and display the term as a telop on the terminal, the same function may be provided by software processing by the communications server 220 in this preferred embodiment. A term specified by each terminal may be displayed as a popup message on the other terminal by way of the web server 210. Or, a telop memory may be provided in the communications server 220 and a term specified by each terminal may be written into the telop memory via the web server 210 to display a text telop on each terminal.

While the aforementioned interpretation center uses the controller 110 to interconnect a caller terminal, a callee terminal and an interpreter terminal, the connection procedure is performed by the web server 210 in this preferred embodiment because each terminal has a web access function.

FIG. 10 is a processing flowchart of a connection procedure by the web server 210. In the videophone interpretation system 200, a caller terminal may access and log into the web server 210 in the interpretation center, which begins the acceptance of the interpretation service.

As shown in FIG. 10, the web server 210 first acquires the terminal address of a caller (S200) and sets the terminal address to the connection table 222 (S202). Next, the web server delivers a screen which prompts input of the language type of the caller, similar to that shown in FIG. 5(a), to the caller terminal (S204). The language type of the caller input by the caller is acquired (S206). The web server delivers a screen to prompt input of the language type of the callee, similar to that shown in FIG. 5(b), to the caller terminal (S208). The language type of the callee input by the caller is acquired (S210). The web server delivers a screen to prompt input of the selection conditions, similar to that shown in FIG. 6(a), to the caller terminal (S212). The interpreter selection conditions input by the caller are acquired (S214).

Next, an interpreter with an availability flag set is selected from among the interpreters satisfying the language type and selection conditions referring to the interpreter registration table 212. The web server 210 delivers a list of interpreter candidates, similar to that shown in FIG. 6(b), to the caller terminal to prompt input of the selection number of a desired interpreter (S216). The selection number of the interpreter input by the caller is acquired and the terminal address of the selected interpreter is acquired from the interpreter registration table 212 (S218). Based on the acquired terminal address of the interpreter, the web server 210 delivers a calling screen to the interpreter terminal (S220). If the call is accepted by the interpreter (S222), the terminal address of the interpreter is set to the connection table 222 (S224). The web server 210 delivers a screen to prompt input of the terminal address of the callee, similar to that shown in FIG. 7, to the caller terminal (S226). The terminal address of the callee input by the caller is acquired (S228). Based on the acquired terminal address of the callee, the web server 210 delivers a calling screen to the callee terminal (S230). If the call is accepted by the callee terminal (S232), the callee terminal address is set to the connection table 222 (S234). Then, a videophone interpretation service begins (S236).

If the interpreter terminal does not accept the call in S222, whether another candidate is available is determined (S238). If another candidate is available, the web server delivers a message to prompt the caller to select another candidate to the caller terminal (S240), then execution returns to S218. If another candidate is not found, the web server notifies the caller terminal of such (S242) and the call is released. If the callee terminal does not accept the call in S232, the caller terminal and the selected interpreter terminal are notified of such (S244) and the call is released.

When the selected interpreter terminal does not accept the call, the caller is notified of such and the call is released in this preferred embodiment. However, an interpretation reservation table to register a caller terminal address and a callee terminal address may be provided and the caller and the callee may be notified in a later response from the selected interpreter to set a videophone interpretation service.

While the interpreter terminal is located in the videophone interpretation system 200 of the interpretation center in this preferred embodiment, the present invention is not limited thereto, and some or all of the interpreter terminals may be installed outside the interpretation center and connected via the Internet. These terminals may be addressed by the same processing.

While the configuration of the videophone interpretation system has been described for a case in which a videophone terminal used by a caller, a callee or an interpreter is a telephone-type videophone terminal connected to a public telephone line, and a case in which the videophone terminal is an IP-type videophone terminal connected to the Internet, the telephone-type videophone terminal and the IP-type videophone terminal can communicate with each other by providing a gateway to perform protocol conversion therebetween. A videophone interpretation system conforming to one protocol may thus be provided to support a videophone terminal which uses another protocol.

In this manner, the videophone interpretation system enables the user to receive or provide an interpretation service anywhere he/she may be, as long as he/she has a terminal which can be connected to a public telephone line or the Internet. An interpreter does not always have to visit an interpretation center, but can join a conversation via interpretation from his/her home or a facility or site where a videophone terminal is located, or provide an interpretation service by using a cellular phone or a portable terminal equipped with a videophone function.

A person with interpretation skills may wish to register in the interpreter registration table in the interpretation center in order to provide an interpretation service anytime when it is convenient for him/her. From the viewpoint of the operation of the interpretation center, it is not necessary for the interpreters to be at the center. This enables efficient operation of the interpretation center both in terms of time and costs.

While one interpreter performs both interpretation from the language of the callee into the language of the caller and interpretation from the language of the caller into the language of the callee in this preferred embodiment, a first interpreter to interpret the language of the callee into the language of the caller and a second interpreter to interpret the language of the caller into the language of the callee may be individually provided to perform a bidirectional simultaneous interpretation.

FIG. 11 shows an example of the system configuration of a videophone interpretation system which provides a bidirectional simultaneous interpretation according to a third preferred embodiment of the present invention. While this example uses a telephone-type videophone, an IP-type videophone may be used as mentioned above.

In FIG. 11, a numeral 300 represents a videophone interpretation system installed in an interpretation center which provides a bidirectional simultaneous interpretation service. The videophone interpretation system 300 interconnects a videophone terminal used by a caller (hereinafter referred to as a caller terminal) 10, a videophone terminal used by a callee (hereinafter referred to as a callee terminal) 20, a videophone terminal used by a first interpreter (hereinafter referred to as a first interpreter terminal) 32, and a videophone terminal used by a second interpreter (hereinafter referred to as a second interpreter terminal) 34 via a public telephone line 40 in order to provide a videophone interpretation service in which a videophone conversation between a caller and a callee is interpreted by the first interpreter and the second interpreter.

The videophone interpretation system 300 includes a caller terminal line I/F 320, a callee terminal line I/F 340, a first interpreter terminal line I/F 360 and a second interpreter terminal line I/F 380. To each I/F, a multiplexer/demultiplexer 322, 342, 362, 382 for multiplexing/demultiplexing a video signal, an audio signal or a data signal, a video CODEC (coder/decoder) 324, 344, 364, 384 for compressing/expanding a video signal, and an audio CODEC 326, 346, 366, 386 for compressing/expanding an audio signal are connected. Each line I/F, each multiplexer/demultiplexer, and each video CODEC or each audio CODEC performs call control, streaming control and compression/expansion of a video/audio signal in accordance with a protocol used by each terminal.

To the video input of the caller terminal video CODEC 324, a video synthesizer 328 for synthesizing the video output of the callee terminal video CODEC 344, the video output of the first interpreter terminal video CODEC 364 and the output of the caller terminal telop memory 332 is connected.

To the video input of the callee terminal video CODEC 344, a video synthesizer 348 for synthesizing the video output from the caller terminal video CODEC 324, the video output from the second interpreter terminal video CODEC 384, and the output of the callee terminal telop memory 352 is connected.

To the video input of the first interpreter terminal video CODEC 364, a video synthesizer 368 for synthesizing the video output of the caller terminal video CODEC 324, the video output of the callee terminal video CODEC 344, and the output of the first interpreter terminal telop memory 372 is connected.

To the video input of the second interpreter terminal video CODEC 384, a video synthesizer 388 for synthesizing the video output of the callee terminal video CODEC 344, the video output of the caller terminal video CODEC 324, and the output of the second interpreter terminal telop memory 392 is connected.

While video display of a first interpreter or a second interpreter may be omitted on a caller terminal or a callee terminal, understanding of the voice interpreted by the interpreter is facilitated by displaying the video of the interpreter, such that it is preferable to be able to synthesize the video of an interpreter.

While video display of a caller or a callee may be omitted on a first interpreter terminal or a second interpreter terminal, the interpreter's understanding of the speech to be interpreted is facilitated by displaying the videos, such that it is preferable to be able to display the video of a caller or a callee.

FIG. 12(a)-(d) show an example of video displayed on the screen of each terminal during a videophone conversation via the videophone interpretation system 300. FIG. 12(a) shows the screen of a caller terminal, on which a synthesized video of a callee and a first interpreter obtained by the video synthesizer 328 is displayed. While the video of the callee is displayed as a main window and the video of the first interpreter is displayed as a sub window in a Picture-in-Picture fashion in this example, the Picture-in-Picture may also display the video of the first interpreter as a main window and the video of the callee as a sub window. Or, these videos may be displayed in equal size. FIG. 12(b) shows the screen of a callee terminal, on which a synthesized video of a caller and a second interpreter obtained by the video synthesizer 348 is displayed. While the video of the caller is displayed as a main window and the video of the second interpreter is displayed as a sub window in a Picture-in-Picture fashion in this example, the Picture-in-Picture may also display the video of the second interpreter as a main window and the video of the caller as a sub window. Or, these videos may be displayed in equal size. FIG. 12(c) shows the screen of a first interpreter terminal, on which a synthesized video of a callee and a caller obtained by the video synthesizer 368 is displayed. While the video of the callee is displayed as a main window and the video of the caller is displayed as a sub window in a Picture-in-Picture fashion in this example, the videos may appear in opposite windows. Or, these videos may be displayed in equal size. FIG. 12(d) shows the screen of a second interpreter terminal, on which a synthesized video of a caller and a callee obtained by the video synthesizer 388 is displayed. While the video of the caller is displayed as a main window and the video of the callee is displayed as a sub window in a Picture-in-Picture fashion in this example, the videos may appear in opposite windows. Or, these videos may be displayed in equal size.

To the audio input of the caller terminal audio CODEC 326, an audio synthesizer 330 for synthesizing the audio output of the callee terminal audio CODEC 346 and the audio output of the first interpreter terminal audio CODEC 366 is connected. To the audio input of the callee terminal audio CODEC 346, an audio synthesizer 350 for synthesizing the audio output of the caller terminal audio CODEC 326 and the audio output of the second interpreter terminal audio CODEC 386 is connected.

To the audio input of the first interpreter terminal audio CODEC 366, the audio output of the callee terminal audio CODEC 346 is connected. To the audio input of the second interpreter terminal audio CODEC 386, the audio output of the caller terminal audio CODEC 326 is connected.

With this configuration, the audio of the first interpreter is transmitted only to the caller, and the audio of the second interpreter is transmitted only to the callee. Thus, the speech of the caller is not disturbed by the audio of the second interpreter, and the speech of the callee is not disturbed by the audio of the first interpreter, thereby providing an effective conversation.
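The audio connections described above can be written down as a small routing table, sketched here in Python. The party labels and the helper function are hypothetical names chosen for illustration; the table itself follows the connections of the audio synthesizers 330, 350 and the audio CODECs 366, 386.

```python
# Destination terminal -> audio sources delivered to it.
AUDIO_ROUTES = {
    "caller": ["callee", "interpreter1"],        # audio synthesizer 330
    "callee": ["caller", "interpreter2"],        # audio synthesizer 350
    "interpreter1": ["callee"],                  # input of audio CODEC 366
    "interpreter2": ["caller"],                  # input of audio CODEC 386
}


def listeners_of(source):
    """Return the terminals that receive audio from the given source."""
    return sorted(dst for dst, srcs in AUDIO_ROUTES.items() if source in srcs)
```

Querying the table confirms the property stated in the text: the first interpreter is heard only by the caller, and the second interpreter only by the callee.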

The caller terminal audio synthesizer 330 is equipped with a function to suppress the audio level from the callee terminal when the audio from the first interpreter terminal is detected, and the callee terminal audio synthesizer 350 is equipped with a function to suppress the audio level from the caller terminal when the audio from the second interpreter terminal is detected. This prevents the audio of the first interpreter or the second interpreter from overlapping with the audio of the opponent party, which would hinder listening. The first interpreter and the second interpreter can simultaneously interpret the speech of the speaker, thus enabling a quick and precise interpretation.

FIG. 17 shows specific examples of the function to suppress the audio of the callee or caller in the audio synthesizers 330, 350. As shown in FIG. 17, the audio output of the first interpreter terminal audio CODEC 366 is connected to a caller terminal audio signal adder 390. The audio output of the second interpreter terminal audio CODEC 386 is connected to a callee terminal audio signal adder 393. As a result, the unnecessary voice of the second interpreter is not transmitted to the caller and the unnecessary voice of the first interpreter is not transmitted to the callee.

To the caller terminal audio signal adder 390, the audio output of the callee terminal audio CODEC 346 is connected via an attenuator 391, which attenuates the audio from the callee terminal when the audio of the first interpreter is detected by the signal detector 392. To the callee terminal audio signal adder 393, the audio output of the caller terminal audio CODEC 326 is connected via an attenuator 394, which attenuates the audio from the caller terminal when the audio of the second interpreter is detected by the signal detector 395. The signal detectors 392, 395 are set to an appropriate detection level in order to prevent the audio of the opponent party from being attenuated by mistake due to noise.
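The detector, attenuator and adder arrangement can be sketched in Python as follows. This is a simplified model under stated assumptions: audio is represented as lists of float samples, detection is performed sample by sample against a fixed threshold, and the `threshold` and `attenuation` values are arbitrary illustrative choices rather than values given in the description.

```python
def mix_with_attenuation(partner, interpreter, threshold=0.05, attenuation=0.2):
    """Mix partner and interpreter audio for one listener.

    partner, interpreter: equal-length sequences of float samples.
    threshold: assumed detection level of the signal detector.
    attenuation: assumed gain applied to the partner while the
    interpreter is detected.
    """
    out = []
    for p, i in zip(partner, interpreter):
        if abs(i) >= threshold:      # signal detector: interpreter audio present
            p = p * attenuation      # attenuator: suppress the partner audio
        out.append(p + i)            # audio signal adder
    return out
```

Setting the threshold above the expected noise floor keeps low-level noise on the interpreter line from attenuating the partner audio by mistake. A practical detector would more likely track a short-term envelope than individual samples.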

In order to ensure that the caller or the callee can hear the audio of an interpreter immediately after the audio of the interpreter is detected by the signal detector 392, 395, an appropriate signal delay unit may be provided at the interpreter audio input of the audio signal adder 390, 393.

While the audio of the opponent party is attenuated by the attenuator 391, 394 such that the caller or callee can hear the original voice of the opponent party to some extent in the background of the audio of the first interpreter or second interpreter in this preferred embodiment, a switch may be used instead to turn off the audio of the opponent party.

FIG. 18 shows an example in which the audio of the opponent party is turned off when the audio of the interpreter is transmitted, and only the audio of the interpreter is transmitted. As shown in FIG. 18, switches 397, 398 are used instead of the audio signal adders 390, 393. When the audio of the interpreter is detected by the signal detectors 392, 395, the switches 397, 398 are turned from the audio of the opponent party to the audio of the interpreter. The remaining configuration is the same as that shown in FIG. 17.

In order to ensure that the caller or the callee can hear the audio of an interpreter immediately after the audio of the interpreter is detected by the signal detector 392, 395, an appropriate signal delay unit may be provided at the interpreter audio input of the switch 397, 398.
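The switch variant with a delay unit on the interpreter input might be modeled as in the Python sketch below. The sample-based delay line, the `hold` counter that keeps the switch on the interpreter until the delayed samples have passed, and the parameter values are assumptions introduced for illustration and are not taken from FIG. 18.

```python
from collections import deque


def switch_with_delay(partner, interpreter, delay=2, threshold=0.05):
    """Select partner or (delayed) interpreter audio for one listener.

    The detector observes the undelayed interpreter signal, so the
    switch has already turned when the delayed samples arrive.
    """
    line = deque([0.0] * delay)   # delay unit on the interpreter input
    hold = 0                      # samples for which the switch stays turned
    out = []
    for p, i in zip(partner, interpreter):
        line.append(i)
        delayed = line.popleft()
        if abs(i) >= threshold:   # signal detector
            hold = delay + 1
        if hold > 0:
            out.append(delayed)   # switch on the interpreter side
            hold -= 1
        else:
            out.append(p)         # switch on the opponent party side
    return out
```

With a delay of two samples, interpreter samples that start at index 1 are output starting at index 3, and none of them is lost while the switch is turning.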

While the audio signal adder 390, 393 simply adds the audio of the interpreter and the audio of the opponent party in this preferred embodiment, audio multiplexing of two signals may be used as well. For example, if a terminal supports stereophonic audio, stereophonic synthesis is performed on the audio of the opponent party as the left channel and the audio of the interpreter as the right channel and the result is transmitted to a terminal, where the receiving party selects a necessary audio. In this configuration, it is not necessary to provide an attenuator to attenuate the audio of the opponent party in the videophone interpretation system. The receiving party listens to the audios while adjusting the volume balance of the right and left channels of a headset.
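The stereophonic alternative can be sketched in Python as a pairing of the two signals into left and right channels, with the balance applied on the receiving side. The function names and the `balance` parameter, ranging from 0.0 (left only) to 1.0 (right only), are assumptions for illustration.

```python
def stereo_multiplex(partner, interpreter):
    """Pair samples as (left, right) = (opponent party, interpreter)."""
    return list(zip(partner, interpreter))


def apply_balance(frames, balance=0.5):
    """Receiving-side volume balance.

    balance 0.0 passes only the left channel (opponent party),
    1.0 passes only the right channel (interpreter).
    """
    return [((1 - balance) * left, balance * right) for left, right in frames]
```

Because the selection happens at the receiving terminal, the interpretation system in this configuration only needs to multiplex the two signals and does not attenuate either of them.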

While the first interpreter listens only to the audio of the callee to perform interpretation and the second interpreter listens only to the audio of the caller to perform interpretation, a configuration may be provided in which the audio of the caller and the audio of the second interpreter may be attenuated and added to or audio multiplexed into the audio to be transmitted to the first interpreter, and also the audio of the callee and the audio of the first interpreter may be attenuated and added to or audio multiplexed into the audio to be transmitted to the second interpreter. By doing so, each interpreter can perform interpretation while checking the progress of the entire conversation and the responses of the interpretee.

The videophone interpretation system 300 includes an interpreter registration table 312 in which the terminal number of a terminal used by an interpreter is registered and includes a controller 310 connected to each of the line I/Fs 320, 340, 360, 380, multiplexers/demultiplexers 322, 342, 362, 382, video synthesizers 328, 348, 368, 388, audio synthesizers 330, 350, and telop memories 332, 352, 372, 392. The controller 310 provides a function to connect a caller terminal, a callee terminal, a first interpreter terminal, and a second interpreter terminal by a function to accept a call from a caller terminal, a function to acquire the language type of the caller and the language type of the callee, a function to acquire the selection conditions for selecting an interpreter, a function to extract the terminal number of the first interpreter and the terminal number of the second interpreter by referencing the interpreter registration table 312 by using the acquired language types and selection conditions, a function to call the first interpreter terminal and second interpreter terminal by using the terminal numbers of the interpreters extracted, and a function to call the callee terminal by using the acquired terminal number of the callee.

Operation of the video synthesizers 328, 348, 368, 388 and audio synthesizers 330, 350 is controlled by the controller 310. A function is included by which the user changes the video output method or audio output method by pressing a predetermined number button on the dial pad of each terminal. To provide this, the multiplexer/demultiplexer 322, 342, 362, 382 detects, based on a data signal or a tone signal, that a number button on the dial pad of a terminal has been pressed and signals the detection to the controller 310. This ensures flexibility in the usage of the system on each terminal. For example, only the necessary video or audio may be selected and displayed or output in accordance with the objective, a main window may be swapped with a sub window, or the position of the sub window may be changed.

A caller terminal telop memory 332, a callee terminal telop memory 352, a first interpreter terminal telop memory 372 and a second interpreter terminal telop memory 392 are connected to the inputs of the video synthesizers 328, 348, 368, 388, respectively. The contents of each telop memory 332, 352, 372, 392 can be set by the controller 310. With this configuration, by setting a message to be displayed on each terminal in the telop memory 332, 352, 372, 392 and issuing a command to the video synthesizer 328, 348, 368, 388 to select the signal of the telop memory 332, 352, 372, 392 during the setup of a videophone conversation via interpretation, it is possible to transmit the necessary messages to the respective terminals to establish a four-way call.

If there is a term which is difficult to explain or a word which isdifficult to pronounce in a videophone conversation, it is possible toregister in advance the term in the term registration table 313 of thecontroller 310 in association with the number of the dial pad on eachterminal. By doing so, it is possible to detect that the dial pad oneach terminal is pressed during a videophone conversation by using adata signal or a tone signal on the multiplexer/demultiplexer 322, 342,362, 382, extract a term corresponding to the number of the dial padpressed from the term registration table 313, generate a text telop, andset the text telop to each telop memory, thereby displaying the term oneach terminal. This communicates, by way of a text telop, to theopponent party a term which is difficult to explain or a word which isdifficult to pronounce, thus providing a quicker and more precisevideophone conversation.
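The term lookup described above can be sketched as follows. This is an illustration only: the table contents, the dictionary representation of the telop memories, and the function name `on_dial_key` are assumptions, not details given for the term registration table 313.

```python
from typing import Dict

# Hypothetical contents of a term registration table:
# dial-pad key -> pre-registered term.
TERM_TABLE: Dict[str, str] = {
    "1": "health insurance card",
    "2": "residence permit",
}


def on_dial_key(key: str,
                term_table: Dict[str, str],
                telop_memories: Dict[str, str]) -> bool:
    """Handle a dial-pad key detected during a conversation.

    If a term is registered for the key, write it as a text telop into
    every terminal's telop memory and return True. If no term is
    registered, leave the telop memories untouched and return False.
    """
    term = term_table.get(key)
    if term is None:
        return False
    for terminal in telop_memories:
        telop_memories[terminal] = term
    return True
```

In this sketch each telop memory holds a single string, so a newly selected term simply replaces whatever was displayed before.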

Next, the connection processing by the controller 310 for establishing a videophone conversation via bidirectional simultaneous interpretation is described.

Prior to processing, interpreter selection information and a terminalnumber of a terminal used by each interpreter are registered in theinterpreter registration table 312 of the controller 310 from anappropriate terminal (not shown). FIG. 13 shows an example ofregistration item to be registered in the interpreter registration table312. As shown in FIG. 13, items registered in the interpreterregistration table 312 are same as those registered in the interpreterregistration table 112 shown in FIG. 3, except that a listeningcomprehension level and a speaking level are separately registered for asupported language. By doing so, it is possible to individually selectan optimum interpreter as a first interpreter who interprets thelanguage of the callee into the language of the caller or a secondinterpreter who interprets the language of the caller into the languageof the callee.

FIG. 14 shows a processing flowchart of the connection processing by the controller 310. The videophone interpretation system 300 accepts an order for interpretation services when the caller calls the telephone number of the caller terminal line I/F. The videophone interpretation system 300 then calls the first interpreter terminal, the second interpreter terminal and the callee terminal, and establishes a connection for a bidirectional simultaneous interpretation service.

As shown in FIG. 14, the presence of the call to the caller terminalline I/F 320 is detected (S300). When a call is detected, a screen whichprompts input of the language type of the caller, similar to that shownin FIG. 5(a), is displayed on the caller terminal (S302). The languagetype of the caller input by the caller is acquired (S304). A screenwhich prompts input of the language type of the callee similar to thatshown in FIG. 5(b) is displayed on the caller terminal (S306). Thelanguage type of the callee input by the caller is acquired (S308).Next, a screen which prompts the interpreter selection conditionssimilar to that shown in FIG. 6(a) is displayed on the caller terminal(S310). The interpreter selection conditions input by the caller areacquired (S312). In this example, the interpreter selection conditionsare, similar to the previous single interpretation, a gender, an agebracket, an area, a specialty and an interpretation level. The area isspecified by using a ZIP code and an interpreter is selected beginningwith the habitation closest to the specified area. For any selections,if it is not necessary to specify a condition, “N/A” may be selected.

Next, an interpreter who has a specified listening comprehension levelof the language of the callee and a speaking level of the language ofthe caller, and whose gender, age, habitation and specialty satisfy theacquired selection conditions, with his/her availability flag being set,is selected as a first interpreter referring to the interpreterregistration table 312 (S314). The terminal number of the selectedinterpreter is extracted and called (S316). When a response is receivedfrom the first interpreter terminal (S318), an interpreter who has aspecified listening comprehension level of the language of the callerand a speaking level of the language of the callee, and whose gender,age, habitation and specialty satisfy the acquired selection conditions,with his/her availability flag being set is selected as a secondinterpreter referring to the interpreter registration table 312 (S320).Then the terminal number of the selected interpreter is extracted andcalled (S322).
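The selection in S314 and S320 can be sketched as a filter over the registration table. The sketch below is a simplification under stated assumptions: levels are integers where a higher value is better, a single minimum level applies to both listening and speaking, only gender and specialty are modeled as selection conditions, and a condition value of `None` stands for "N/A". The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Interpreter:
    terminal_number: str
    gender: str
    specialty: str
    listening: Dict[str, int] = field(default_factory=dict)  # language -> level
    speaking: Dict[str, int] = field(default_factory=dict)   # language -> level
    available: bool = True  # availability flag


def select_interpreter(table: List[Interpreter],
                       listen_lang: str,
                       speak_lang: str,
                       min_level: int,
                       conditions: Dict[str, Optional[str]],
                       tried: FrozenSet[str] = frozenset()) -> Optional[Interpreter]:
    """Return the first available entry that meets the levels and conditions.

    Entries whose terminal number is in `tried` are skipped, so the caller
    can ask for another candidate after an unanswered call.
    """
    for entry in table:
        if not entry.available or entry.terminal_number in tried:
            continue
        if entry.listening.get(listen_lang, 0) < min_level:
            continue
        if entry.speaking.get(speak_lang, 0) < min_level:
            continue
        if any(value is not None and getattr(entry, name) != value
               for name, value in conditions.items()):
            continue
        return entry
    return None
```

A first interpreter would be requested with the callee's language as `listen_lang` and the caller's language as `speak_lang`, and a second interpreter with the two swapped, which is why the listening and speaking levels are registered separately.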

When a response is received from the second interpreter terminal (S324), a screen to prompt input of the terminal number of the callee, similar to that shown in FIG. 7, is displayed on the caller terminal (S326). The terminal number of the callee input by the caller is extracted and called (S328).

When a response is received from the callee terminal (S330), a videophone interpretation service via bidirectional simultaneous interpretation begins (S332).

If a response is not received from the first interpreter terminal inS318, whether another candidate is available is determined (S334). Ifanother candidate is available, execution returns to S314 and theprocedure is repeated. If another candidate is unavailable, the callerterminal is notified of such and the call is released (S336). If aresponse is not received from the second interpreter terminal in S324,whether another candidate is available is determined (S338). If anothercandidate is available, execution returns to S320 and the procedure isrepeated. If another candidate is unavailable, the caller terminal andthe first interpreter terminal are notified of such and the call isreleased (S340). If a response is not received from the callee terminalin S330, the caller terminal, first interpreter terminal and secondinterpreter terminal are notified of such and the call is released(S342).
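The order of calls and the fallback branches of FIG. 14 can be summarized in a short sketch. It assumes a hypothetical `call` function that returns `True` when the called terminal responds, and it represents each interpreter's candidates as an ordered list of terminal numbers; the status strings are illustrative, not messages defined by the embodiment.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_responder(candidates: Iterable[str],
                    call: Callable[[str], bool]) -> Optional[str]:
    """Call candidates in order and return the first terminal that responds."""
    for number in candidates:
        if call(number):
            return number
    return None


def connect(first_candidates: Iterable[str],
            second_candidates: Iterable[str],
            callee_number: str,
            call: Callable[[str], bool]) -> Tuple[str, Optional[str], Optional[str]]:
    """Follow the sequence: first interpreter, second interpreter, callee."""
    # S314-S318, with retry over further candidates (S334).
    first = first_responder(first_candidates, call)
    if first is None:
        return ("released: no first interpreter", None, None)   # S336
    # S320-S324, with retry over further candidates (S338).
    second = first_responder(second_candidates, call)
    if second is None:
        return ("released: no second interpreter", first, None)  # S340
    # S328-S330.
    if not call(callee_number):
        return ("released: callee did not respond", first, second)  # S342
    return ("connected", first, second)                          # S332
```

The returned terminal numbers indicate which parties were already connected when the call was released, and therefore which terminals need to be notified.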

While, in a step of selecting a first interpreter (S314) and a step ofselecting a second interpreter (S320), an interpreter who satisfiespredetermined conditions is selected referring to the interpreterregistration table 312 for simplicity in this preferred embodiment, aconfiguration is also possible in which, similar to the first preferredembodiment, a candidate list similar to that shown in FIG. 6(b) isdisplayed and the caller selects an interpreter from the list. In thisconfiguration, the hourly rates (not shown) of each of the firstinterpreter and second interpreter registered in the interpreterregistration table 312 may be extracted and displayed as a charge. Thisenables the user to consider the cost of the interpretation servicebefore selecting an appropriate interpreter. The hourly rates of theinterpreter may be determined from the interpretation level of theselected interpreter by referencing an accounting table which specifiesthe relationship between the interpretation level and the hourly rates.

The controller 310 includes a timer (not shown) for calculating the fee of the interpretation service. The timer measures the time from when the connection is established to when it is released. Upon completion of an interpretation service, the fee is calculated from the time measured by the timer and the sum of the hourly rates of the first interpreter and the second interpreter mentioned above, registered in an accounting database 314, and charged to the user at a later time.
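As a worked illustration of the fee calculation, the sketch below multiplies the measured connection time by the sum of the two hourly rates. Billing pro rata by the second is an assumption, since no rounding rule or billing unit is specified, and the function name and sample rates are hypothetical.

```python
def interpretation_fee(elapsed_seconds: int,
                       first_hourly_rate: float,
                       second_hourly_rate: float) -> float:
    """Fee = connection time in hours x (first rate + second rate).

    Assumes pro-rata billing by the second with no rounding.
    """
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must not be negative")
    return (first_hourly_rate + second_hourly_rate) * elapsed_seconds / 3600


# A 30-minute call with hourly rates of 40 and 60 (arbitrary currency units).
print(interpretation_fee(1800, 40.0, 60.0))  # 50.0
```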

When the selected interpreter terminal does not accept the call, the caller is simply notified of such and the call is released in this preferred embodiment. However, an interpretation reservation table to register a caller terminal number and a callee terminal number may be provided such that, when both the selected first interpreter and the selected second interpreter later respond and accept the call, the caller and the callee are notified and the videophone conversation service begins.

While the videophone interpretation system 300 includes a line I/F, amultiplexer/demultiplexer, a video CODEC, an audio CODEC, a videosynthesizer, an audio synthesizer and a controller in this preferredembodiment, these components need not be provided as individual hardware(H/W), and the function of each component may be provided by softwareprocessing on a computer.

While the first interpreter terminal 32 and the second interpreter terminal 34, similar to the caller terminal 10 and the callee terminal 20, are located outside the interpretation center and called from the interpretation center over a public telephone line to provide an interpretation service in this preferred embodiment, the invention is not limited thereto, and some or all of the interpreter terminals may be installed in the interpretation center such that the interpretation services are provided from the interpretation center.

In this preferred embodiment, an interpreter can join an interpretation service anywhere he/she may be, as long as he/she has a terminal which can be connected to a public telephone line. Thus, the interpreter can provide interpretation services by using the availability flag to make efficient use of free time. This enables efficient and stable operation of interpretation services, which often have difficulty in securing necessary personnel.

While a video signal of the home terminal is not input to the videosynthesizers 328, 348, 368, 388 in the above-described preferredembodiment, a function may be provided to input the video signal of thehome terminal and synthesize and display to check the video on theterminal.

While the video synthesizers 328, 348, 368, 388 are used to synthesizevideo for each terminal in the above-described preferred embodiments,video from all terminals may be synthesized at once and the result maybe transmitted to each terminal. In this case, as shown in FIG. 21(b)for example, video of the caller, video of the callee, video of thefirst interpreter and video of the second interpreter may be displayedin a four split screen.

While the telop memories 332, 352, 372, 392 are provided and their outputs are added to the corresponding video synthesizers 328, 348, 368, 388, respectively, in order to display a text telop on each terminal in this preferred embodiment, a function may be provided whereby telop memories to store audio information are provided and their outputs are added to the audio synthesizers 330, 350, and an audio synthesizer is provided at the input of each of the first interpreter terminal audio CODEC 366 and the second interpreter terminal audio CODEC 386 to which the outputs of the corresponding telop memories are added, in order to output an audio message on each terminal. This makes it possible to provide a videophone interpretation service even if any of the caller, the callee, the first interpreter or the second interpreter is a visually impaired person.

Finally, a recording/reproduction function to record video or audio in a videophone interpretation service and reproduce the audio or video and transmit the result upon receiving a request from the user will be described.

FIG. 19 shows an example of a recording/reproduction function in thevideophone interpretation system according to the first preferredembodiment. As shown in FIG. 19, video from the caller terminal videoCODEC 124, video from the callee terminal video CODEC 144, and videofrom the interpreter terminal video CODEC 164 are synthesized by thevideo synthesizer 116 and the result is transmitted to a video/audiorecorder/player 118. The audio output of the audio synthesizer 130 to betransmitted to the caller terminal and the audio output of the audiosynthesizer 150 to be transmitted to the callee terminal are audiomultiplexed by an audio multiplexer 117 in which the former is theleft-channel and the latter is the right-channel, and the result istransmitted to the video/audio recorder/player 118.

The video output of the video synthesizer 116 and the audio output of the audio multiplexer 117 during an interpretation service are automatically recorded onto the video/audio recorder/player 118 and stored for each user based on a command from the controller 110. The video and audio stored in the video/audio recorder/player 118 are reproduced based on a command from the controller 110 when the multiplexer/demultiplexer 122 or 142 detects that a predetermined dial number has been pressed on the caller terminal or callee terminal, and the reproduced video and audio are transmitted to the detected terminal via the video synthesizer 128 or 148 and the audio synthesizer 130 or 150 for that terminal.

This allows the user to check video from each terminal during an interpretation in a four split screen shown in FIG. 21(a). If the user terminal is equipped with an audio multiplexing/demultiplexing function, audio from each terminal can be checked in the language of the caller on the left channel and in the language of the callee on the right channel. The user may call the interpretation center at a later time and input a predetermined access code from his/her terminal to reproduce and check video and audio stored in the video/audio recorder/player 118.

A method for synthesizing video or audio to be recorded onto avideo/audio recorder/player is not limited to the above-describedexample, and may be any method as long as the user can check thecontents of the interpretation service. In order to support a situationin which the user terminal is not equipped with the audiomultiplexing/demultiplexing function, audio transmitted to the callerand audio transmitted to the callee may be individually recorded and theaudio specified by a terminal may be reproduced and transmitted.

The user may be a person other than the person who has obtained theinterpretation service. When a person granted access has called theinterpretation center from a videophone terminal and input an accesscode, he/she may receive video and audio stored in the video/audiorecorder/player 118.

FIG. 20 shows an example of a recording/reproduction function in the videophone interpretation system with bidirectional simultaneous interpretation according to the third embodiment. As shown in FIG. 20, video from the caller terminal video CODEC 324, video from the callee terminal video CODEC 344, video from the first interpreter terminal video CODEC 364, and video from the second interpreter terminal video CODEC 384 are synthesized by the video synthesizer 316 and the result is transmitted to a video/audio recorder/player 318. The audio output of the audio synthesizer 330 to be transmitted to the caller terminal and the audio output of the audio synthesizer 350 to be transmitted to the callee terminal are audio multiplexed by an audio multiplexer 317 such that the former is the left channel and the latter is the right channel, and the result is transmitted to the video/audio recorder/player 318.

The video output of the video synthesizer 316 and the audio output of the audio multiplexer 317 during an interpretation service are automatically recorded onto the video/audio recorder/player 318 and stored for each user based on a command from the controller 310. The video and audio stored in the video/audio recorder/player 318 are reproduced based on a command from the controller 310 when the multiplexer/demultiplexer 322 or 342 detects that a predetermined dial number has been pressed on the caller terminal or callee terminal, and the reproduced video and audio are transmitted to the detected terminal via the video synthesizer 328 or 348 and the audio synthesizer 330 or 350 for that terminal.

This allows the user to check video from each terminal during aninterpretation in a four split screen shown in FIG. 21(b). If the userterminal is equipped with an audio multiplexing/demultiplexing function,audio from each terminal can be checked, in the language of the callerin left-channel and in the language of the callee in right-channel. Theuser may call the interpretation center at a later time and input apredetermined access code from his/her terminal to reproduce and check avideo and an audio stored in the video/audio recorder/player 318.

A method for synthesizing a video or audio to be recorded onto avideo/audio recorder/player is not limited to the above-describedexample, and may be any method as long as the user can check thecontents of the interpretation service. In order to support a situationin which the user terminal is not equipped with the audiomultiplexing/demultiplexing function, an audio transmitted to the callerand an audio transmitted to the callee may be individually recorded andthe audio specified by a terminal may be reproduced and transmitted.

The user may be a person other than the person who has obtained theinterpretation service. When a person granted access has called theinterpretation center from a videophone terminal and input an accesscode, he/she may receive a video and an audio stored in the video/audiorecorder/player 318.

As mentioned above, the videophone interpretation system or videophoneinterpretation method of the invention is advantageous in that a callerdoes not have to search for an interpreter in advance and conductconsultation with a callee, and in that the system and the method areavailable in an emergency, thereby minimizing the time occupied by theinterpreter to reduce the interpretation service cost.

While the present invention has been described with respect to preferred embodiments, it will be apparent to those skilled in the art that the disclosed invention may be modified in numerous ways and may assume many embodiments other than those specifically set out and described above. Accordingly, it is intended by the appended claims to cover all modifications of the present invention that fall within the true spirit and scope of the invention.

1-20. (canceled)
 21. A videophone interpretation system in which an interpreter interprets a videophone conversation between a caller and a callee who speak different languages, said videophone interpretation system comprising: connection means for connecting a caller terminal, a callee terminal and an interpreter terminal; and communication means for communicating video and audio between the caller terminal, the callee terminal and the interpreter terminal connected by said connection means; wherein said connection means includes: an interpreter registration table in which at least the language types that can be interpreted by an interpreter and a terminal number of the interpreter are registered; a function to accept a call from a caller terminal; a function to acquire a terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which said call was accepted; a function to extract a terminal number of an interpreter by referencing said interpreter registration table from the acquired language type of the caller and language type of the callee; a function to call the interpreter terminal by using the extracted terminal number of the interpreter; and a function to call the callee terminal by using the acquired terminal number of the callee; said communication means includes: a function to transmit video including at least video from said callee terminal to said caller terminal; a function to transmit video including at least video from said caller terminal to said callee terminal; a first audio transmission function to synthesize audio from said callee terminal and audio from said interpreter terminal and transmit the result to said caller terminal; a second audio transmission function to synthesize audio from said caller terminal and audio from said interpreter terminal and transmit the result to said callee terminal; a third audio transmission function to synthesize audio from said caller terminal and audio from said callee terminal and transmit the result to said interpreter terminal; said first audio transmission function includes a callee audio suppression function to suppress audio from said callee terminal when audio from said interpreter terminal is detected; said second audio transmission function includes a caller audio suppression function to suppress audio from said caller terminal when audio from said interpreter terminal is detected; a detection function to detect a selection signal for selecting either the caller terminal or the callee terminal based on an audio signal input from said interpreter terminal; and an interpretation audio selective suppression function to suppress audio on the side not selected by the selection signal detected by said detection function out of audio from the interpreter terminal supplied to said first audio transmission function and an audio from the interpreter terminal supplied to said second audio transmission function.
 22. A videophone interpretation system in which an interpreter interprets a videophone conversation between a caller and a callee who speak different languages, said system comprising: connection means for connecting a caller terminal, a callee terminal and an interpreter terminal; and communication means for communicating video and audio between the caller terminal, the callee terminal and the interpreter terminal connected by said connection means; wherein said connection means includes: an interpreter registration table in which at least the language types that can be interpreted by an interpreter and the terminal number of the interpreter are registered; a function to accept a call from a caller terminal; a function to acquire a terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which said call was accepted; a function to extract a terminal number of an interpreter by referencing said interpreter registration table from the acquired language type of the caller and language type of the callee; a function to call the interpreter terminal by using the extracted terminal number of the interpreter; and a function to call the callee terminal by using the acquired terminal number of the callee; and said communication means includes: a function to transmit video including at least video from said callee terminal to said caller terminal; a function to transmit video including at least video from said caller terminal to said callee terminal; a first audio transmission function to selectively transmit either audio from said callee terminal or audio from said interpreter terminal to said caller terminal; a second audio transmission function to selectively transmit either audio from said caller terminal or audio from said interpreter terminal to said callee terminal; a third audio transmission function to synthesize audio from said caller terminal and audio from said callee terminal and transmit the result to said interpreter terminal; said first audio transmission function includes a function to turn off audio from said callee terminal and transmit audio from said interpreter terminal when audio from said interpreter terminal is detected; said second audio transmission function includes a function to turn off audio from said caller terminal and transmit audio from said interpreter terminal when audio from said interpreter terminal is detected; a detection function to detect a selection signal for selecting either the caller terminal or the callee terminal based on an audio signal input from said interpreter terminal; and an interpretation audio selective suppression function to suppress the audio on the side not selected by the selection signal detected by said detection function out of audio from the interpreter terminal supplied to said first audio transmission function and an audio from the interpreter terminal supplied to said second audio transmission function.
 23. A videophone interpretation system in which an interpreter interprets a videophone conversation between a caller and a callee who speak different languages, said system comprising: connection means for connecting a caller terminal, a callee terminal and an interpreter terminal; and communication means for communicating video and audio between the caller terminal, the callee terminal and the interpreter terminal connected by said connection means; wherein said connection means includes: an interpreter registration table in which at least the language types that can be interpreted by an interpreter and a terminal number of the interpreter are registered; a function to accept a call from the caller terminal; a function to acquire a terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which said call was accepted; a function to extract a terminal number of the interpreter by referencing said interpreter registration table from the acquired language type of the caller and language type of the callee; a function to call the interpreter terminal by using the extracted terminal number of the interpreter; and a function to call the callee terminal by using the acquired terminal number of the callee; and said communication means includes: a function to transmit video including at least video from said callee terminal to said caller terminal; a function to transmit video including at least video from said caller terminal to said callee terminal; a first audio transmission function to perform audio multiplexing of audio from said callee terminal and audio from said interpreter terminal such that a receiving party will separately listen to the audio into left-channel and right-channel; a second audio transmission function to perform audio multiplexing of audio from said caller terminal and audio from said interpreter terminal such that a receiving party will separately listen to the audio into left-channel and right-channel; a third audio transmission function to perform audio multiplexing of audio from said caller terminal and audio from said callee terminal such that a receiving party will separately listen to the audio into left-channel and right-channel; a detection function to detect a selection signal for selecting either the caller terminal or the callee terminal based on an audio signal input from said interpreter terminal; and an interpretation audio selective suppression function to suppress the audio on the side not selected by the selection signal detected by said detection function out of audio from the interpreter terminal supplied to said first audio transmission function and audio from the interpreter terminal supplied to said second audio transmission function.
 24. The videophoneinterpretation system according to claim 21, wherein said communicationmeans includes: a function to transmit video obtained by synthesizingvideo from said callee terminal as a main window and video from saidinterpreter terminal as a sub window to said caller terminal; a functionto transmit video obtained by synthesizing video from said callerterminal as a main window and video from said interpreter terminal as asub window to said callee terminal; and a function to transmit videoobtained by synthesizing video from said caller terminal and video fromsaid callee terminal to said interpreter terminal.
 25. The videophoneinterpretation system according to claim 22, wherein said communicationmeans includes: a function to transmit video obtained by synthesizingvideo from said callee terminal as a main window and video from saidinterpreter terminal as a sub window to said caller terminal; a functionto transmit video obtained by synthesizing video from said callerterminal as a main window and video from said interpreter terminal as asub window to said callee terminal; and a function to transmit videoobtained by synthesizing video from said caller terminal and video fromsaid callee terminal to said interpreter terminal.
 26. The videophoneinterpretation system according to claim 23, wherein said communicationmeans includes: a function to transmit video obtained by synthesizingvideo from said callee terminal as a main window and video from saidinterpreter terminal as a sub window to said caller terminal; a functionto transmit video obtained by synthesizing video from said callerterminal as a main window and video from said interpreter terminal as asub window to said callee terminal; and a function to transmit videoobtained by synthesizing video from said caller terminal and video fromsaid callee terminal to said interpreter terminal.
 27. The videophone interpretation system according to claim 21, wherein said communication means includes: a function to record video including video from said caller terminal, video from said callee terminal and video from said interpreter terminal and audio including audio from said caller terminal, audio from said callee terminal and audio from said interpreter terminal; and a function to reproduce and transmit the recorded video and audio in response to a request made by a terminal.
 28. The videophone interpretation system according to claim 22, wherein said communication means includes: a function to record video including video from said caller terminal, video from said callee terminal and video from said interpreter terminal and audio including audio from said caller terminal, audio from said callee terminal and audio from said interpreter terminal; and a function to reproduce and transmit the recorded video and audio in response to a request made by a terminal.
 29. The videophone interpretation system according to claim 23, wherein said communication means includes: a function to record video including video from said caller terminal, video from said callee terminal and video from said interpreter terminal and audio including audio from said caller terminal, audio from said callee terminal and audio from said interpreter terminal; and a function to reproduce and transmit the recorded video and audio in response to a request made by a terminal.
 30. A videophone interpretation system in which a videophone conversation between a caller and a callee who speak different languages is interpreted by a first interpreter who interprets the language of the callee into the language of the caller and a second interpreter who interprets the language of the caller into the language of the callee, said videophone interpretation system comprising: connection means for connecting a caller terminal, a callee terminal, a first interpreter terminal and a second interpreter terminal; and communication means for communicating video and audio between the caller terminal, the callee terminal, the first interpreter terminal and the second interpreter terminal connected by said connection means; wherein said connection means includes: an interpreter registration table in which at least the language types that can be interpreted by an interpreter and terminal numbers of the interpreters are registered; a function to accept a call from a caller terminal; a function to acquire a terminal number of a callee, language type of the caller and language type of the callee from the caller terminal for which said call was accepted; a function to extract a terminal number of a first interpreter by referencing said interpreter registration table from the acquired language type of the callee and language type of the caller; a function to call the first interpreter by using the extracted terminal number of the interpreter; a function to extract a terminal number of a second interpreter by referencing said interpreter registration table from the acquired language type of the caller and language type of the callee; a function to call the second interpreter by using the extracted terminal number of the interpreter; and a function to call the callee terminal by using the acquired terminal number of the callee; and said communication means includes: a function to transmit video including at least video from said callee terminal and audio including at least audio from said first interpreter to said caller terminal; a function to transmit video including at least video from said caller terminal and audio including at least audio from said second interpreter to said callee terminal; a function to transmit audio including at least audio from said callee terminal to said first interpreter terminal; and a function to transmit audio including at least audio from said caller terminal to said second interpreter terminal.
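The two table lookups recited in claim 30 can be illustrated with a minimal Python sketch. It assumes that the interpreter registration table stores directed language pairs (source language, target language) for each interpreter; the claim only requires that interpretable language types and terminal numbers be registered, so the `Interpreter` record, its `directions` field and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interpreter:
    """One row of a hypothetical interpreter registration table."""
    terminal_number: str
    # Assumption: language types are stored as directed (source, target) pairs.
    directions: frozenset


def find_interpreter(table, source, target) -> Optional[str]:
    """Return the terminal number of the first interpreter handling source -> target."""
    for entry in table:
        if (source, target) in entry.directions:
            return entry.terminal_number
    return None


def plan_calls(table, callee_number, caller_lang, callee_lang):
    """Return the terminal numbers that the connection means would call."""
    # First interpreter: language of the callee into the language of the caller.
    first = find_interpreter(table, callee_lang, caller_lang)
    # Second interpreter: language of the caller into the language of the callee.
    second = find_interpreter(table, caller_lang, callee_lang)
    if first is None or second is None:
        raise LookupError("no interpreter registered for one of the directions")
    return {
        "first_interpreter": first,
        "second_interpreter": second,
        "callee": callee_number,
    }
```

Under this assumption, one interpreter registered for both directions would be returned for both roles; a real system might need an additional rule to prevent that.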
 31. The videophone interpretation system according to claim 30, wherein said communication means includes: a function to transmit video obtained by synthesizing video from said callee terminal as a main window and video from said first interpreter terminal as a sub window to said caller terminal; a function to transmit video obtained by synthesizing video from said caller terminal as a main window and video from said second interpreter terminal as a sub window to said callee terminal; a function to transmit video obtained by synthesizing video from said callee terminal and video from said caller terminal to said first interpreter terminal; and a function to transmit video obtained by synthesizing video from said caller terminal and video from said callee terminal to said second interpreter terminal.
 32. The videophone interpretation system according to claim 30, wherein said communication means includes: a first audio transmission function to synthesize audio from said callee terminal and audio from said first interpreter terminal and transmit the result to said caller terminal; a second audio transmission function to synthesize audio from said caller terminal and audio from said second interpreter terminal and transmit the result to said callee terminal; a third audio transmission function to transmit at least audio from said callee terminal to said first interpreter terminal; and a fourth audio transmission function to transmit at least audio from said caller terminal to said second interpreter terminal; said first audio transmission function includes a callee audio suppression function to suppress audio from said callee terminal when audio from said first interpreter terminal is detected; and said second audio transmission function includes a caller audio suppression function to suppress audio from said caller terminal when audio from said second interpreter terminal is detected.
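The suppression behaviour of claim 32 can be sketched as a per-frame mixer. The sketch assumes that interpreter audio is "detected" when the root-mean-square level of a frame exceeds a threshold, and that suppression means attenuating the conversation partner by a fixed gain; neither the detection method nor the values of `threshold` and `gain` come from the claims.

```python
def rms(frame):
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return (sum(s * s for s in frame) / len(frame)) ** 0.5


def mix_with_suppression(partner, interpreter, threshold=0.02, gain=0.1):
    """Mix one frame of partner audio with one frame of interpreter audio.

    When the interpreter frame is louder than `threshold`, the partner
    frame is scaled by `gain` before the two frames are added.
    Both values are illustrative assumptions.
    """
    if len(partner) != len(interpreter):
        raise ValueError("frames must have equal length")
    scale = gain if rms(interpreter) > threshold else 1.0
    return [p * scale + i for p, i in zip(partner, interpreter)]
```

Setting `gain=0.0` turns the partner audio off entirely while the interpreter speaks, which approximates the selective transmission of claim 33.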
 33. The videophone interpretation system according to claim 30, wherein said communication means includes: a first audio transmission function to selectively transmit either audio from said callee terminal or audio from said first interpreter terminal to said caller terminal; a second audio transmission function to selectively transmit either audio from said caller terminal or audio from said second interpreter terminal to said callee terminal; a third audio transmission function to transmit at least audio from said callee terminal to said first interpreter terminal; and a fourth audio transmission function to transmit at least audio from said caller terminal to said second interpreter terminal; said first audio transmission function includes a function to turn off audio from said callee terminal and transmit audio from said first interpreter terminal when detecting audio from said first interpreter terminal; and said second audio transmission function includes a function to turn off audio from said caller terminal and transmit audio from said second interpreter terminal when detecting audio from said second interpreter terminal.
 34. The videophone interpretation system according to claim 30, wherein said communication means includes: a first audio transmission function to perform audio multiplexing of audio from said callee terminal and audio from said first interpreter terminal and transmit the result to said caller terminal such that the receiving party will listen to the audio into left-channel and right-channel separately; a second audio transmission function to perform audio multiplexing of audio from said caller terminal and audio from said second interpreter terminal and transmit the result to said callee terminal such that the receiving party will listen to the audio into left-channel and right-channel separately; a third audio transmission function to transmit at least audio from said callee terminal to said first interpreter terminal; and a fourth audio transmission function to transmit at least audio from said caller terminal to said second interpreter terminal.
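The audio multiplexing of claim 34 can be illustrated by interleaving two mono frames into a single stereo frame. The claim does not say which source goes to which channel, so the sketch below assumes that the conversation partner is carried on the left channel and the interpreter on the right; the function names are hypothetical.

```python
def multiplex_stereo(left_source, right_source):
    """Interleave two mono frames into one stereo frame: L0, R0, L1, R1, ..."""
    if len(left_source) != len(right_source):
        raise ValueError("frames must have equal length")
    stereo = []
    for left, right in zip(left_source, right_source):
        stereo.extend((left, right))
    return stereo


def demultiplex_stereo(stereo):
    """Split an interleaved stereo frame back into (left, right) mono frames."""
    return stereo[0::2], stereo[1::2]
```

For example, `multiplex_stereo(callee_frame, interpreter_frame)` would produce the frame sent to the caller terminal, where the receiving party can listen to each channel separately.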
 35. The videophone interpretation system according to claim 30, wherein said communication means includes: a function to record video including video from said caller terminal, video from said callee terminal, video from said first interpreter terminal and video from said second interpreter terminal and audio including audio from said caller terminal, audio from said callee terminal, audio from said first interpreter terminal and audio from said second interpreter terminal; and a function to reproduce and transmit the recorded video and audio in response to a request made by a terminal.
 36. The videophone interpretation system according to claim 21, wherein selection information for selecting an interpreter is registered in said interpreter registration table; and said connection means includes a function to acquire conditions for selecting an interpreter from said caller terminal and a function to extract the terminal number of an interpreter who satisfies said acquired selection conditions by referencing said interpreter registration table.
 37. The videophone interpretation system according to claim 22, wherein selection information for selecting an interpreter is registered in said interpreter registration table; and said connection means includes a function to acquire conditions for selecting an interpreter from said caller terminal and a function to extract the terminal number of an interpreter who satisfies said acquired selection conditions by referencing said interpreter registration table.
 38. The videophone interpretation system according to claim 23, wherein selection information for selecting an interpreter is registered in said interpreter registration table; and said connection means includes a function to acquire conditions for selecting an interpreter from said caller terminal and a function to extract the terminal number of an interpreter who satisfies said acquired selection conditions by referencing said interpreter registration table.
 39. The videophone interpretation system according to claim 30, wherein selection information for selecting an interpreter is registered in said interpreter registration table; and said connection means includes a function to acquire conditions for selecting an interpreter from said caller terminal and a function to extract the terminal number of an interpreter who satisfies said acquired selection conditions by referencing said interpreter registration table.
 40. The videophone interpretation system according to claim 21, wherein an availability flag to indicate whether an interpreter is available is registered in said interpreter registration table; and said connection means includes a function to reference an availability flag in said interpreter registration table to extract the terminal number of an available interpreter.
 41. The videophone interpretation system according to claim 22, wherein an availability flag to indicate whether an interpreter is available is registered in said interpreter registration table; and said connection means includes a function to reference an availability flag in said interpreter registration table to extract the terminal number of an available interpreter.
 42. The videophone interpretation system according to claim 23, wherein an availability flag to indicate whether an interpreter is available is registered in said interpreter registration table; and said connection means includes a function to reference an availability flag in said interpreter registration table to extract the terminal number of an available interpreter.
 43. The videophone interpretation system according to claim 30, wherein an availability flag to indicate whether an interpreter is available is registered in said interpreter registration table; and said connection means includes a function to reference an availability flag in said interpreter registration table to extract the terminal number of an available interpreter.
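Claims 36 to 43 add two filters to the registration table lookup: selection conditions supplied by the caller terminal, and an availability flag. A minimal Python sketch of such a lookup follows. The `Entry` fields, the representation of selection information as a key-value `attributes` mapping, and the exact-match rule for conditions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One row of a hypothetical interpreter registration table."""
    terminal_number: str
    languages: frozenset                 # language types the interpreter handles
    available: bool = True               # availability flag
    attributes: dict = field(default_factory=dict)  # selection information


def extract_terminal(table, caller_lang, callee_lang, conditions=None):
    """Return the terminal number of the first matching interpreter, or None."""
    conditions = conditions or {}
    for entry in table:
        if not entry.available:
            continue  # skip interpreters flagged as unavailable
        if not {caller_lang, callee_lang} <= entry.languages:
            continue  # interpreter must handle both language types
        if all(entry.attributes.get(k) == v for k, v in conditions.items()):
            return entry.terminal_number
    return None
```

A caller who asks for a particular specialty could pass, for instance, `conditions={"field": "medical"}`; the key name is hypothetical.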
 44. The videophone interpretation system according to claim 21, wherein said connection means includes a function to generate a text message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated text message to each of said terminals.
 45. The videophone interpretation system according to claim 22, wherein said connection means includes a function to generate a text message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated text message to each of said terminals.
 46. The videophone interpretation system according to claim 23, wherein said connection means includes a function to generate a text message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated text message to each of said terminals.
 47. The videophone interpretation system according to claim 30, wherein said connection means includes a function to generate a text message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated text message to each of said terminals.
 48. The videophone interpretation system according to claim 21, wherein said connection means includes a function to generate a voice message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated voice message to each of said terminals.
 49. The videophone interpretation system according to claim 22, wherein said connection means includes a function to generate a voice message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated voice message to each of said terminals.
 50. The videophone interpretation system according to claim 23, wherein said connection means includes a function to generate a voice message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated voice message to each of said terminals.
 51. The videophone interpretation system according to claim 30, wherein said connection means includes a function to generate a voice message to be transmitted to each of said terminals; and said communication means includes a function to transmit the generated voice message to each of said terminals.
 52. A videophone interpretation system according to claim 21, wherein said connection means includes a function to register a term used during a conversation based on a command from each of said terminals and a function to extract the registered term and generate a telop based on a command from each of said terminals; and said communication means includes a function to transmit the generated telop to each of said terminals.
 53. A videophone interpretation system according to claim 22, wherein said connection means includes a function to register a term used during a conversation based on a command from each of said terminals and a function to extract the registered term and generate a telop based on a command from each of said terminals; and said communication means includes a function to transmit the generated telop to each of said terminals.
 54. A videophone interpretation system according to claim 23, wherein said connection means includes a function to register a term used during a conversation based on a command from each of said terminals and a function to extract the registered term and generate a telop based on a command from each of said terminals; and said communication means includes a function to transmit the generated telop to each of said terminals.
 55. A videophone interpretation system according to claim 30, wherein said connection means includes a function to register a term used during a conversation based on a command from each of said terminals and a function to extract the registered term and generate a telop based on a command from each of said terminals; and said communication means includes a function to transmit the generated telop to each of said terminals.
 56. A videophone interpretation system according to claim 21, wherein accounting information about an interpreter is registered in said interpreter registration table, and said connection means includes a function to measure the time that said caller terminal or callee terminal obtains an interpretation service and a function to calculate a fee from the measured time and accounting information registered in said interpreter registration table.
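The fee calculation of claim 56 can be sketched as follows. The sketch assumes that the accounting information is a per-minute rate and that any started minute is billed in full; the claim specifies neither the unit of the rate nor a rounding rule, and `calculate_fee` is a hypothetical name.

```python
from decimal import Decimal


def calculate_fee(elapsed_seconds: int, rate_per_minute: Decimal) -> Decimal:
    """Return the fee for a measured service time.

    Assumption: billing is per started minute, so the measured time is
    rounded up to whole minutes before the rate is applied.
    """
    if elapsed_seconds < 0:
        raise ValueError("elapsed time cannot be negative")
    billed_minutes = (elapsed_seconds + 59) // 60  # integer ceiling division
    return rate_per_minute * billed_minutes
```

`Decimal` is used for the rate so that currency amounts are not subject to binary floating-point rounding.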
 57. A videophone interpretation method in which an interpreter interprets a videophone conversation between a caller and a callee who speak different languages, said method using an interpreter registration table in which at least the language types that can be interpreted by an interpreter and a terminal number of the interpreter are registered, said method comprising: a step of accepting a call from a caller terminal; a step of acquiring a terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which said call was accepted; a step of extracting a terminal number of an interpreter by referencing said interpreter registration table from the acquired language type of the caller and language type of the callee; a step of calling the interpreter terminal by using the extracted terminal number of the interpreter; a step of calling the callee terminal by using the acquired terminal number of the callee; a step of transmitting video including at least video from said callee terminal to said caller terminal; a step of transmitting video including at least video from said caller terminal to said callee terminal; a first audio transmission step of synthesizing audio from said callee terminal and audio from said interpreter terminal and transmitting the result to said caller terminal; a second audio transmission step of synthesizing audio from said caller terminal and audio from said interpreter terminal and transmitting the result to said callee terminal; and a third audio transmission step of synthesizing audio from said caller terminal and audio from said callee terminal and transmitting the result to said interpreter terminal; said first audio transmission step including a callee audio suppression step of suppressing audio from said callee terminal when audio from said interpreter terminal is detected; said second audio transmission step including a caller audio suppression step of suppressing audio from said caller terminal when audio from said interpreter terminal is detected; a detection step of detecting a selection signal for selecting either the caller terminal or the callee terminal based on an audio signal input from said interpreter terminal; and an interpretation audio selective suppression step of suppressing audio on the side not selected by the selection signal detected by said detection step out of audio from the interpreter terminal supplied to said first audio transmission step and audio from the interpreter terminal supplied to said second audio transmission step.
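The interpretation audio selective suppression step of claim 57 can be illustrated with a small routing function. The sketch assumes that the detection step has already decoded the selection signal into the string "caller" or "callee", and it models suppression as replacing the unselected copy with silence; how the signal is encoded in the interpreter's audio is not addressed here.

```python
def route_interpreter_audio(frame, selected):
    """Return (to_caller, to_callee) copies of one interpreter audio frame.

    `selected` is the already decoded selection signal. The copy for the
    side that was not selected is replaced by silence.
    """
    if selected not in ("caller", "callee"):
        raise ValueError("selection must be 'caller' or 'callee'")
    silence = [0.0] * len(frame)
    if selected == "caller":
        return list(frame), silence
    return silence, list(frame)
```

The two returned frames would then be supplied to the first and second audio transmission steps respectively.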
 58. A videophone interpretation method in which a videophone conversation between a caller and a callee who speak different languages is interpreted by a first interpreter who interprets language of a callee into the language of a caller, and a second interpreter who interprets the language of the caller into the language of the callee, said method using an interpreter registration table where at least the language types interpretable by an interpreter and terminal number of the interpreter are registered, said method comprising: a step of accepting a call from a caller terminal; a step of acquiring a terminal number of a callee, the language type of the caller and the language type of the callee from the caller terminal for which said call was accepted; a step of extracting a terminal number of a first interpreter by referencing said interpreter registration table from the acquired language type of the callee and language type of the caller; a step of calling the first interpreter terminal by using the extracted terminal number of the first interpreter; a step of extracting a terminal number of a second interpreter by referencing said interpreter registration table from the acquired language type of the caller and language type of the callee; a step of calling the second interpreter terminal by using the extracted terminal number of the second interpreter; a step of calling the callee terminal by using the acquired terminal number of the callee; a step of transmitting video including at least video from said callee terminal and audio including at least audio from said first interpreter terminal to said caller terminal; a step of transmitting video including at least video from said caller terminal and audio including at least audio from said second interpreter terminal to said callee terminal; a step of transmitting audio including at least audio from said callee terminal to said first interpreter terminal; and a step of transmitting audio including at least audio from said caller terminal to said second interpreter terminal.